Nucleotide sequence and regulation of the Escherichia coli gene for ferrienterobactin transport protein FepB

The Escherichia coli fepB gene encodes a periplasmic protein required for ferrienterobactin transport; four fepB-related polypeptides are resolved by standard sodium dodecyl sulfate-polyacrylamide gel electrophoresis. In vitro DNA-directed protein-synthesizing systems and experiments with the inhibitors dinitrophenol, carbonyl cyanide m-chlorophenylhydrazone, and ethanol demonstrated that the initial fepB translation product is processed. The nucleotide sequence of fepB and neighboring regions was determined. The predicted proFepB has a molecular weight of 34,255, consists of 318 amino acids, and is devoid of cysteine residues. A leader peptide is present, as are three possible leader peptidase cleavage sites after positions 22, 23, and 26. The upstream regulatory region included a Fur box, indicating that fepB is iron regulated, which was verified by RNA dot blot experiments. The regulatory region included a 68-amino-acid open reading frame (ORF) that encompassed a sequence capable of forming a large stem-and-loop structure. Indirect evidence indicated that this ORF must be translated for fepB transcription to occur. Six copies of the nonomer CCCTC(A/T)CCC or its invert were present in the stem-and-loop region. An ORF of unknown significance was found downstream from fepB; its product would have a molecular weight of 18,036 and be rich in proline and alanine. Processing of proFepB remains unclear, but the appearance of the three smaller members of the FepB family required the action of leader peptidase and the presence of the entire fepB gene.

Iron is required by all organisms except certain lactobacilli. Despite being abundant, iron is frequently not readily available to microbes for several reasons, including the poor solubility of Fe(III) at neutral and alkaline pHs and the presence of host iron-binding proteins such as lactoferrin and transferrin. Under iron-deficient conditions, many aerobic and facultatively anaerobic microbes synthesize and excrete siderophores [low-molecular-weight Fe(III)-chelating compounds] to solubilize and transport the metal (for a review, see reference 9). All Escherichia coli strains have a high-affinity iron uptake system that uses the siderophore enterobactin (Ent) (reviewed in reference 12). The synthesis of enzymes necessary for Ent biosynthesis is coordinately regulated with production of the transport proteins required to bring FeEnt complexes into the cell; the genes for all of the components specific for the Ent system map in a cluster at approximately 13.5 min.
Passage of FeEnt through the cell envelope requires an outer membrane receptor protein acting in concert with what appears to be a typical osmotic-shock-sensitive transport system. Three genes, fepA, fepB, and fepC, direct the synthesis of polypeptides specifically required for FeEnt uptake, and there is evidence for several additional fep (transport) genes (30). An 81,000-dalton outer membrane receptor protein for FeEnt is encoded by fepA, FepB is a periplasmic protein, and FepC is a cytoplasmic membrane protein that corresponds to the conserved nucleotide-binding protein (C. M (2). Other possible fep genes have been less rigorously studied, in part because of the difficulty in identifying their cognate polypeptides. By analogy with well-characterized shock-sensitive systems, these polypeptides could be anticipated to be highly hydrophobic and present in the cytoplasmic membrane in very small amounts (2). The tonB (16) and exbB (20) gene products are also required for normal FeEnt uptake; these polypeptides function in a variety of transport systems and presumably interact with surface receptors to mediate an energy-dependent step necessary for passage of ligands through the outer membrane (36).
The present study focused on the fepB gene and its product. Previous work (33) in which FepB was localized to the periplasm provided the first evidence that FeEnt uptake is accomplished by a periplasmic transport system and suggested that FepB has a crucial role in the process. It was also observed that the fepB gene is responsible for the appearance of four protein bands with molecular weights ranging from 31,500 to 36,500 in standard sodium dodecyl sulfate (SDS)-polyacrylamide gels, a result that has been confirmed (30). Here we report the deoxynucleotide sequence of fepB and its upstream region and show that both leader peptidase and information at the FepB carboxy terminus are necessary if the family of fepB-related polypeptides is to be observed. A preliminary report describing aspects of this work has been presented (M. Elkins (33).

MATERIALS AND METHODS
Deletions in CsCl-purified pME200 and pME201 DNA were generated, after restriction with BamHI and SphI, with exonuclease III as described by Henikoff (21).
Labeled minicells were suspended in a minimal volume of distilled H2O, diluted in 2x solubilization buffer, and boiled for 5 min, and then 100 to 200 kcpm was loaded onto 10% SDS-polyacrylamide gels (39). High-range molecular weight protein markers (Bethesda Research Laboratories, Inc.) were used as standards. Gels were fixed and stained (14) and exposed to Kodak XRP-1 film.

In (22), and 20 µl of 37% formaldehyde was heated to 60°C for 15 min. Then, 25 µl was added to the first two horizontal rows of a microdilution plate containing 25 µl of 15x SSC in all wells except those in the first row. Alternating columns of RNA isolated from iron-replete and iron-starved cells were prepared in this manner. A series of 1:1 dilutions was prepared vertically on the plate beginning with the second row. An additional 125 µl of 15x SSC was then added to each well and mixed, and the contents were spotted onto Genescreen (Du Pont, NEN Research Products). A sample of 10^6 trichloroacetic acid-precipitable counts of each probe was hybridized to a set of two columns of RNA spots, washed as described by instructions provided with Genescreen, and subsequently exposed to Kodak XR-P film.

RESULTS
Subcloning and deletion analysis of the fepB region. Previous work (33) indicated that fepB is present on a 4.0-kilobase (kb) EcoRV fragment of pCP111. This fragment was inserted into the EcoRV site of pACYC184 and the SmaI site of pGEM-3 Blue, yielding plasmids pME200 and pME201, respectively. As anticipated, both of these plasmids complemented strain DK214 (fepB). Unidirectional digestion for various times with exonuclease III was carried out on each plasmid, and the resulting deletion derivatives were then recircularized. Plasmids from each time point were roughly sized by restriction mapping, and then representative clones were transformed into DK214. Complementation of DK214 was tested by noting whether the presence of the plasmid (i) permitted growth on succinate-dipyridyl plates and (ii) resulted in a reduced halo size on chrome azurol S indicator agar. The results (Fig. 1) indicated that fepB was present on at most 1.5 kb of DNA.
Nucleotide sequence of fepB. Deletion derivatives of pME200 and pME201 were subcloned into M13 derivatives and sequenced by the chain termination method. The sequencing strategy is diagrammed in Fig. 2 and the results are displayed in Fig. 3. Computer analysis of this region revealed a single large open reading frame (ORF) with a GTG initiation codon (position 414) and a TAA termination codon (position 1368). A Shine-Dalgarno sequence was present, as was a downstream inverted repeat region that might function in transcription termination. This ORF (ORF2) was capable of encoding a 34,255-dalton polypeptide consisting of 318 amino acids. The predicted FepB protein has a signal peptide of at least 22 amino acids; three possible signal peptidase cleavage sites (38) were present. A leader peptide was also indicated by the hydropathy plot (M. F. Elkins, Ph.D. thesis, University of Texas at Austin, Austin, 1988). The calculated molecular weight for the initial translation product corresponds well with that determined by SDS-polyacrylamide gel electrophoresis for the largest fepB-encoded polypeptide (36,500) (33) and is in the molecular weight range (25,000 to 56,000) typical of periplasmic binding proteins (2,17). The sequence also predicts that FepB is devoid of cysteine residues.
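The coordinates quoted for ORF2 can be checked against the reported protein length with a few lines of arithmetic. The sketch below is illustrative only: it assumes that positions 414 and 1368 refer to the first base of the GTG and TAA codons, respectively (the text does not say so explicitly), and the function and variable names are hypothetical.

```python
# Values reported in the text (positions follow the Fig. 3 numbering).
START_CODON_POS = 414   # assumed: first base of the GTG initiation codon
STOP_CODON_POS = 1368   # assumed: first base of the TAA termination codon
N_RESIDUES = 318        # predicted length of proFepB
PRO_FEPB_MW = 34_255    # predicted molecular weight in daltons


def coding_length_nt(start_pos: int, stop_pos: int) -> int:
    """Nucleotides from the first base of the start codon up to,
    but not including, the first base of the stop codon."""
    return stop_pos - start_pos


coding_nt = coding_length_nt(START_CODON_POS, STOP_CODON_POS)
codons = coding_nt // 3              # sense codons, stop codon excluded
in_frame = coding_nt % 3 == 0        # start and stop share a reading frame
avg_residue_mass = PRO_FEPB_MW / N_RESIDUES

print(f"coding nucleotides: {coding_nt}")
print(f"codons: {codons}, in frame: {in_frame}")
print(f"mean residue mass: {avg_residue_mass:.1f} Da")
```

Under that assumption the coding region spans 954 nucleotides, that is, 318 in-frame codons, which agrees with the 318-residue product stated above.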
The sequence upstream from fepB contains the promoter elements for both fepB and entC (13,29). This 417-nucleotide sequence contained two binding sites (Fur boxes) for the regulatory protein Fur; these 19-bp palindromic sequences occur upstream from many iron-regulated genes, and transcription of these genes is reduced upon binding of the
FIG. 1. Restriction and physical maps of pCP111 and several of its derivatives. Vector DNA is not shown. pME200 and pME201 contain a 4-kb EcoRV fragment isolated from pCP111 and differ only in that the fragment was subcloned into the EcoRV site of pACYC184 and the SmaI site of pGEM-3 Blue, respectively. Tn5 insertions were generated in pCP111 but are indicated by arrows on the pME200-pME201 map for precision. Physical maps of some deletions generated by exonuclease III are indicated by slanted lines for those made in pME200 and by solid lines for those made in pME201. The ability of these plasmids to complement DK214 (fepB) is shown on the right. Abbreviations for restriction sites are as follows: E, EcoRI; Hc, HincII; Hp, HpaI; P, PstI; V, EcoRV.
The region upstream from fepB also contains a sequence capable of forming extensive stem-and-loop structures, one of which is shown in Fig. 4. This region of possible secondary structure is located within a 68-amino-acid ORF (ORF1) (Fig. 3, positions 204 to 408) whose carboxy terminus ends in the likely Shine-Dalgarno sequence (401 to 406) for fepB. A comparison of the fepB nucleotide sequence with those listed in GenBank did not reveal any significant homologies. However, homologies to the upstream region capable of stem-and-loop formation were found in sequences downstream from E. coli genes pstA (phoT) (1) and adk (6) and Bradyrhizobium japonicum hemA (23). As expected, each of these three sequences could be arranged so as to exhibit secondary structure, with the largest such predicted structure found downstream from hemA. The regions of homology contained a repeated 9-bp consensus sequence of CCCTC(A/T)CCC (or its inverted repeat GGG[C/A]GAGGG) which was found six times in regions adjacent to fepB and hemA and three times adjacent to pstA and adk (Fig. 5).
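A search for copies of the 9-bp consensus in either orientation can be sketched with a regular expression. This is a minimal illustration, not the procedure used in the study: the helper names and the toy sequence are hypothetical, and the inverted form is taken here to be the strict reverse complement of CCCTC(A/T)CCC, namely GGG(A/T)GAGGG, which is an assumption of the sketch.

```python
import re

NONAMER = "CCCTC[AT]CCC"     # consensus as given in the text
NONAMER_RC = "GGG[AT]GAGGG"  # strict reverse complement (assumption of this sketch)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_nonamers(seq: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, orientation, matched text) for each copy."""
    seq = seq.upper()
    hits = []
    for orientation, pattern in (("+", NONAMER), ("-", NONAMER_RC)):
        # A lookahead is used so that overlapping copies are not skipped.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start() + 1, orientation, match.group(1)))
    return sorted(hits)


# Hypothetical toy sequence, NOT the fepB upstream region.
toy = "AACCCTCACCCTTGGGAGAGGGAA"
for hit in find_nonamers(toy):
    print(hit)
```

Applied to a real sequence, the number of hits in the stem-and-loop region could be compared with the six copies reported adjacent to fepB.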
An ORF (ORF3) capable of encoding an 18,036-dalton polypeptide began 175 bp downstream from fepB (Fig. 3). Of the 174 amino acids encoded by this ORF, 25 are proline and 25 are alanine. No protein encoded by this region has been detected, and there are no obvious promoter or Shine-Dalgarno sequences preceding this ORF. Also, Tn5 insertions 785 and 770 (Fig. 1) yielded no phenotype when placed in the chromosome of strain JC7623 by transformation, whereas 414 inactivated FepB.
Regulation of fepB transcription. The presence of a Fur box upstream from fepB indicated that fepB transcription was controlled in part by the amount of available iron. Accordingly, dot blot experiments were performed on RNA isolated from iron-starved and iron-replete cells. RNA probes were transcribed from pME14, which carries the 750-bp HincII-HpaI fragment (Fig. 1) containing the fepB promoter region and approximately the first half of fepB. As anticipated, the probe made from the noncoding strand preferentially bound to RNA from cells grown under iron-deficient conditions (Fig. 6).
Processing of FepB. The periplasmic location of FepB and fepB sequence data suggested that FepB is initially synthesized as a proprotein. To test this idea, the effects of a variety of processing inhibitors on minicell expression of fepB were examined. Ethanol, carbonyl cyanide m-chlorophenylhydrazone, and dinitrophenol resulted in the accumulation of the largest fepB-encoded polypeptide (Fig. 7A and B). (The precursor of the periplasmic protein β-lactamase was also enriched [Fig. 7A].) Furthermore, plasmid pME200 products were analyzed in an in vitro transcription and translation system (Fig. 7C). Only the largest fepB-related polypeptide was synthesized (lane 1), which agreed with previous results (33). In the presence of leader peptidase, however, the three smaller FepB polypeptides were seen (lane 2). The addition of leader peptidase after transcription and translation of proFepB had occurred yielded the same results (data not shown). These results demonstrated that processing of FepB occurs and suggested that leader peptidase activity is necessary for the appearance of the three smaller fepB-related polypeptides.
FIG. 3. Nucleotide sequence of the coding and regulatory regions of fepB. Three ORFs are shown, with the corresponding amino acid sequences indicated below them; ORF2 is fepB. Underlined sequences indicate two potential entC translational start sites, and those doubly underlined are Fur boxes. A region capable of forming a large stem-and-loop structure is indicated by arrows above the sequence, with internal CCCTC(A/T)CCC or its invert consensus sequences marked with a dotted underline. An inverted repeat downstream from fepB is also shown by arrows above the sequence, restriction enzyme sites are indicated, and rbs designates a ribosome-binding site.
Polypeptides encoded by several derivatives of pME200 and pME201. Sequence analysis indicated that the deletion in pME200 derivative pME13.18 ended 1 bp beyond the fepB TAA termination codon. Also, pME13.47 was predicted to be deleted for the carboxy-terminal 16 amino acids of FepB and in their place to have 12 amino acids (CSTASTYYWAAS) directed by the noncoding strand of the pACYC184 tet gene (40). A similar fusion was predicted for pME13.31, with the carboxy-terminal 86 amino acids for FepB replaced by 14 amino acids (PQPTTGLLPNAGVA). The polypeptides produced by these plasmids (Fig. 8A) are consistent with the sequence data; pME13.18 directed the synthesis of all four fepB-related polypeptides, whereas pME13.47 and pME13.31 each apparently produced just one truncated protein related to FepB and these were of the expected size. Similarly, no FepB protein was synthesized by pME203, which contains DNA only for the carboxy terminus of FepB and the region downstream from fepB, and a single truncated FepB protein is directed by pME204, which lacks the information for the carboxy-terminal 64 amino acids of FepB. The presumed FepB protein encoded by pME204 migrated faster than that of pME13.31, which was not expected. This discrepancy may be an artifact of gel electrophoresis or, less likely, indicate that the pME204 product is processed. In summary, the size and number of fepB-related polypeptides observed with plasmid derivatives of pME200 support the sequencing data with respect to fepB size and direction of transcription. They also provided further evidence that the intact fepB gene is responsible for the appearance of four polypeptides (compare pME13.18 with pME13.47) and indicate that the carboxy terminus of FepB is necessary for the appearance of multiple bands.
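The expected sizes of the carboxy-terminal deletion products follow from simple residue bookkeeping, sketched below. The dictionary layout and function name are hypothetical, and the entry for pME204 assumes that no vector-derived residues are appended, because none are reported in the text.

```python
PRO_FEPB_LEN = 318  # residues in the predicted proFepB

# plasmid -> (carboxy-terminal residues deleted, vector-derived residues appended)
C_TERMINAL_CONSTRUCTS = {
    "pME13.47": (16, "CSTASTYYWAAS"),
    "pME13.31": (86, "PQPTTGLLPNAGVA"),
    "pME204": (64, ""),  # assumption: no appended residues (none reported)
}


def predicted_length(deleted: int, appended: str,
                     full_length: int = PRO_FEPB_LEN) -> int:
    """Residues in the unprocessed product of a carboxy-terminal deletion."""
    return full_length - deleted + len(appended)


lengths = {
    name: predicted_length(deleted, tail)
    for name, (deleted, tail) in C_TERMINAL_CONSTRUCTS.items()
}

for name, n in sorted(lengths.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {n} residues")
```

On these assumptions the pME204 product would be slightly longer than the pME13.31 product, which is why its faster migration was not expected.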
The proteins encoded by three deletion derivatives of pME201, all fepB, were examined (Fig. 8B). pME26, which was deleted up to nucleotide 232 (Fig. 3) and therefore lacked the fepB Fur box and the initial portion of ORF1 but contained the stem-and-loop region and the entire fepB gene, produced no FepB protein. In pME28 the deletion extended into the region of predicted leader peptidase cleavage sites (nucleotide 486, Fig. 3); a fusion in which the first 24 amino acids of proFepB were replaced by 18 amino acids (MTMITPSYLGDTIEYSSL) originating from lacZ and encoded in part by the SP6 promoter and multiple cloning site of pGEM-3 Blue occurred. This plasmid directed the synthesis of two fepB-related polypeptides. The same two polypeptides were found when pME28 was used in an in vitro transcription and translation system, and leader peptidase had no effect on either of them (data not shown). No FepB protein was observed with pME36, which was deleted well into the fepB gene.
FIG. 6. Autoradiogram of dot blot analysis of RNA isolated from AB3311 grown under iron-replete conditions (columns 1 and 3) and iron-starved conditions (columns 2 and 4). RNA was probed with 35S-labeled RNA generated with SP6 (columns 1 and 2) or T7 (columns 3 and 4) RNA polymerase, using pME14 DNA as a template. Columns 1 and 2 represent hybridization to RNA made from the noncoding strand of fepB, and columns 3 and 4 represent hybridization to RNA made from the coding strand.
Data in Fig. 8C provided additional evidence that fepB sequencing data were correct in that no FepB was observed when [35S]cysteine was used as a radiolabel.

DISCUSSION
The sequence data predicted that the immediate product of fepB was a 34,255-dalton polypeptide, a size consistent with polyacrylamide gel electrophoresis data. The validity of the sequencing result is further supported by (i) the complementing activity and sizes of the polypeptides encoded by the various fepB subclones, (ii) evidence for a leader peptide, and (iii) the fact that no cysteine residues were predicted and none were found in FepB. (The absence of cysteine in periplasmic binding proteins is not unusual; neither the maltose [11] nor the ribose-binding proteins [19] contain cysteine.) The basis for the multiple fepB-related products remains unresolved. The largest polypeptide was proFepB; it was the only product observed in in vitro systems, it is membrane associated (33), it accumulated in the presence of agents that slow processing, and the fepB sequence indicated the presence of a leader peptide which is necessary for exported proteins such as FepB. In vitro experiments with leader peptidase indicated that this enzyme was necessary for the appearance of the three smaller fepB products. This datum argues against the possibilities that one or more of the bands resulted from a translational pause, were processed intermediates in translocation (26), or arose from defective processing in minicells (43). The absence of cysteine in FepB eliminated the possibility that the multiple forms arose from incomplete reduction of disulfide bonds or from conformational changes arising from the gain or loss of disulfide bonds during transport (27,35). Plasmids pME13.47, pME13.31, and pME204, which contained deletions of the 3' end of fepB, each produced one fepB-related protein. In conjunction with the leader peptidase results, this suggests that proFepB is processed posttranslationally and that the FepB carboxy terminus is necessary for the maintenance of the precursor in a transport-competent conformation (5).
Three possible signal peptidase cleavage sites were found in proFepB, and three smaller polypeptides result from treatment of proFepB with this enzyme. The remote possibility that signal peptidase acts at several sites on proFepB cannot be eliminated at this time. The significant size differences among the three smaller polypeptides could result from artifacts of SDS-polyacrylamide gel electrophoresis; in a gel system that contains urea (33), these polypeptides are not well separated. Resolution of this problem will presumably be attained by use of anti-FepB antibodies to determine whether all four bands can be detected in normal cells and by determining the amino-terminal sequence of each of the polypeptides.
That only a relatively small ORF (ORF3) was detected downstream from fepB and preliminary evidence that Tn5 insertions in this ORF region caused no transport defect were surprising. In periplasmic transport systems the binding protein gene is generally part of an operon that also includes genes for membrane-bound components (2). Also, Ozenberger et al. (30) found a gene (fepD) in this region, although it was later reported that the two insertional mutations defining fepD were phenotypically unstable (Shea et al., Abstr. Annu. Meet. Am. Soc. Microbiol. 1988) and indicated that it was cotranscribed with fepB. Independent evidence regarding the accuracy of the ORF3 sequencing data is not available as no protein products from this region have been detected. Any gene immediately downstream from fepB cannot be large, however, as 200 bp downstream from the ORF is a region of multiple translation termination sites in the three reading frames corresponding to the fepB direction of transcription. Also, a Fur site exists approximately 1,350 bp downstream from fepB, suggesting a limit to the length of a possible polycistronic message originating from the fepB promoter (Elkins, Ph.D. thesis). This Fur site apparently regulates the fepG and fepC genes (S. S. Chenault, unpublished observations). It seems unlikely that the E. coli chromosome would contain 1,350 bp with no function in the middle of the Ent gene cluster; additional mutational and sequencing experiments are under way to study the anomalies associated with this region.
The regulatory region between fepB and entC contains divergent promoters and two Fur boxes, as has been reported previously (13,29). We presume that the fepB promoter overlaps its Fur box, as is generally the situation for genes immediately downstream from Fur boxes, in which case fepB has a long leader RNA that contains ORF1. Two other iron-regulated genes, cir (18) and entF (32), also have such an arrangement. The fepB leader RNA differs, however, in that it includes a region capable of extensive secondary structure. The largest possible stem-and-loop structure contained six copies of the nonomer CCCTC(A/T)CCC or its inverted repeat and formed a stem containing 30 bp; the free energy of this structure is approximately -69 and -43 kcal (ca. -289 and -180 kJ)/mol at 25°C (44) and 37°C (15), respectively. A variety of smaller stem-and-loop structures were also possible. Neither the significance of ORF1 nor the possible secondary structure region is known. However, indirect evidence from pME26 and pME28 studies suggested that ORF1 must be translated if fepB is to be expressed. pME26 produced no FepB despite having an intact fepB gene and a good vector-supplied promoter. In contrast, two fepB-related proteins were expressed by pME28. (That two polypeptides were observed with pME28 suggests that two vector promoters, perhaps the SP6 promoter as well as the lac promoter, were functioning.) The difference between pME26 and pME28 is that pME26 could synthesize leader RNA containing the upstream region of potential secondary structures but has lost both possible translational initiation codons for ORF1, while in pME28 the entire upstream region and the early portion of ORF2 are deleted. In pME26, transcription may be initiated but prematurely terminated because of an inability to carry out concomitant translation. In pME28, no upstream transcriptional terminator can be synthesized. How translation of ORF1 could act to regulate fepB transcription normally is unclear.
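The kilojoule values quoted in parentheses for the stem-and-loop free energies follow from the kilocalorie values by a fixed factor. The short check below is a sketch that assumes the thermochemical calorie (1 kcal = 4.184 kJ); the names are hypothetical.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie (assumed conversion factor)


def kcal_to_kj(kcal_per_mol: float) -> float:
    """Convert a free energy from kcal/mol to kJ/mol."""
    return kcal_per_mol * KJ_PER_KCAL


# Free energies of the largest stem-and-loop structure, as quoted in the text.
reported_kcal = {"25C": -69.0, "37C": -43.0}

converted_kj = {temp: round(kcal_to_kj(value)) for temp, value in reported_kcal.items()}
print(converted_kj)
```

Rounded to the nearest kilojoule, the conversion gives the -289 and -180 kJ/mol stated in the text.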
Slowing of translation resulting from low levels of iron-related amino acids such as phenylalanine, leucine, serine, tyrosine, cysteine, and tryptophan would decrease, not increase, transcription. (Ent biosynthesis originates with chorismate, the common precursor for tryptophan, tyrosine, and phenylalanine; serine is required for the latter stages of Ent synthesis; and iron limitation effects modifications of tRNAs whose codons start with U [phenylalanine, leucine, serine, tyrosine, cysteine, and tryptophan] [4].) In any case, only 12 of the 68 codons are for iron-related amino acids, there are no tyrosine or tryptophan codons in ORF1, and the iron-related amino acids are not clustered, so a control mechanism based on these amino acids seems unlikely. Other mechanisms by which the dyad symmetry region could function in regulation are possible, for example, as a binding site for a regulatory protein, but further speculation seems premature. The homologies found with other intergenic sequences suggest possible links between the nonomer triplets and both (i) iron metabolism (hemA encodes 5-aminolevulinic acid synthase, and adk is adjacent to hemH [ferrochelatase]) and (ii) binding protein-dependent transport systems (pstA encodes an integral membrane protein of the high-affinity phosphate transport system). A determination of the significance, if any, of the homologies will presumably require a better understanding of the complex interrelationships between iron assimilation and overall cell metabolism.
745-749, 1989; Stock, personal communication). proFepB has almost identically spaced residues at positions Asp-81, Asp-82, Asp-127, and Lys-178. Because effector proteins are phosphorylated, the possibility that FepB is phosphorylated and that this might account for the multiple forms of FepB was tested in a double-label experiment with 32P and [35S]methionine. No 32P label was found associated with any of the four FepB polypeptides, indicating that phosphorylation is not responsible for the unusual number of FepB proteins and that, if it occurs, it is transitory.
LITERATURE CITED