Journal of Bacteriology, November 2008, p. 6948-6960, Vol. 190, No. 21
0021-9193/08/$08.00+0 doi:10.1128/JB.00625-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Division of Bioenvironmental Science, Frontier Science Research Center,1 Division of Microbiology, Department of Infectious Diseases, Faculty of Medicine, University of Miyazaki, Miyazaki, Japan,2 Division of Applied Bacteriology, Graduate School of Medicine,3 Department of Molecular Bacteriology, Research Institute for Microbial Diseases, Osaka University, Osaka, Japan,4 Department of Biological Information, School and Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Tokyo, Japan,5 Kitasato Institute for Life Science, Kitasato University, Kanagawa, Japan,6 RIKEN Genomic Sciences Center, Kanagawa, Japan,7 Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Chiba, Japan8
Received 6 May 2008/ Accepted 18 August 2008
Escherichia coli comprises genotypically and phenotypically divergent strains. A fraction of the strains cause diverse intestinal and extraintestinal diseases in humans by means of individually acquired virulence factors (15). So far, the whole genome sequences of nine E. coli strains, including three benign laboratory strains (K-12 strains MG1655, W3110, and DH10B) and six pathogenic strains (two enterohemorrhagic E. coli [EHEC] O157, three uropathogenic E. coli, and one avian pathogenic E. coli strain), have been published (5, 6, 10, 16, 23-25, 41, 53). Comparisons of these genome sequences have revealed that each genome contains a large amount of strain-specific sequence. For instance, a comparison of EHEC O157 strain RIMD 0509952 (referred to as O157 Sakai) and K-12 MG1655 has revealed that they share a total of 4.1 Mb of sequences but that O157 Sakai and K-12 contain 1.4 Mb and 0.5 Mb of strain-specific sequences, respectively (referred to as S loops and K loops) (24). Importantly, most of the large S loops and K loops are prophages or integrative elements (genetic elements that contain integrase genes but no other genes related to phages and conjugal transfer functions). In O157 Sakai, 18 prophages and 6 integrative elements were identified, and they carry most of the virulence-related genes of O157.
Enteropathogenic E. coli (EPEC) is a major cause of infant diarrhea in nonindustrialized countries (9). Typical EPEC strains are defined by the presence of the locus of enterocyte effacement (LEE) and the EPEC adherence factor (EAF) plasmid. The EAF plasmid encodes bundle-forming pili that are required for bacterium-bacterium interaction and microcolony formation (11, 19). The LEE encodes a set of proteins constituting a type III secretion system (T3SS) machinery, several effector proteins secreted by the T3SS, an adhesin called "intimin," and others (27). LEEs are also possessed by EHEC strains. In O157 Sakai, we recently found that in addition to the seven LEE-encoded effectors, 32 proteins encoded in non-LEE loci are secreted by the LEE-encoded T3SS (referred to as non-LEE-encoded effectors) (48). Almost all of these non-LEE-encoded effectors have been carried over into the O157 genome by prophages and integrative elements. Although the functions of most of the newly identified effectors have yet to be elucidated, they are believed to be required for O157 to express its full virulence. In EPEC, however, only a limited number of non-LEE-encoded effectors have been identified so far (43). This is partly because no systematic search of genomic islands (GEIs) and the virulence-related genes carried therein has been done in EPEC strains.
EPEC strain B171-8 is one of the prototype strains of typical EPEC. The whole sequence of the EAF plasmid of EPEC B171-8 has already been published (49). In this study, we analyzed the genome of EPEC B171-8 by the combined use of whole-genome PCR scanning (WGPScanning) (40) and fosmid mapping and identified a total of 22 large GEIs (>10 kb). On these GEIs, a set of proteins for the T3SS, 33 T3SS effector homologues, and 12 other potential virulence factors were identified.
E. coli EPI300-T1 [F– mcrA Δ(mrr-hsdRMS-mcrBC) φ80dlacZΔM15 ΔlacX74 recA1 endA1 araD139 Δ(ara,leu)7697 galU galK λ– rpsL nupG trfA tonA dhfr] (Epicentre, Madison, WI) was used as the host strain for fosmid library construction. For genomic DNA extraction, cells were grown to stationary phase at 37°C in Luria-Bertani (LB) medium, and the genomic DNA was purified using a Genomic-tip 100/G and genomic DNA buffer set (Qiagen, Tokyo, Japan) in accordance with the manufacturer's instructions. For fosmid DNA extraction, cells were grown for 5 h to early stationary phase at 37°C in LB medium containing chloramphenicol (12.5 µg/ml) and CopyControl induction solution (1 µl in 1 ml LB medium; Epicentre). The fosmid DNA was isolated using a Kurabo PI-1100 automatic DNA isolation system (Kurabo Industries, Ltd., Japan).
WGPScanning analysis. WGPScanning analysis of EPEC B171-8 was performed as previously described (40), with some modifications. In brief, we used 453 pairs of primers that cover the chromosome backbone shared by K-12 and O157 Sakai. All of the primer sequences are available at our website (http://genome.naist.jp/bacteria/o157/pcrscan.html). Long PCR was performed using an LA Taq PCR kit (Takara Shuzo, Kyoto, Japan), with 1 ng of genomic DNA as the template, with 30 cycles of a two-step amplification program of 20 s at 98°C and 16 min at 69°C. The PCR products were separated by field inversion gel electrophoresis, and the product sizes were estimated by a lane analyzer (Atto Corp., Tokyo, Japan).
Construction of fosmid library. A fosmid library for EPEC B171-8 was constructed using a CopyControl fosmid library production kit (Epicentre) in accordance with the manufacturer's instructions, with some modifications. The genomic DNA was sheared using a 26-gauge syringe. After blunting and phosphorylation of the DNA fragments, the fragments were separated by pulsed-field gel electrophoresis with 1% certified low-melt agarose (Bio-Rad Laboratories, Inc., Tokyo, Japan). Fragments of around 40 kb were recovered from the gel by using GELase (Epicentre). After the DNA solution was concentrated by means of a Microcon YM-100 filter (Millipore), the fragments were ligated with the pCC1FOS vector, packaged into phage particles, and transfected into E. coli EPI300-T1 cells. We collected 2,880 chloramphenicol-resistant clones and used them for fosmid mapping analysis.
End sequencing of fosmid clones. The end sequences of the fosmid clones were determined using the Fos-F (5'-TCCCAGTCACGACGTTG-3') and Fos-R (5'-ACCATGATTACGCCAAGC-3') primers, with approximately 50 ng of fosmid DNA as the template. All base calls with a quality value of <20 in the phred program (18) were removed from each read. We used only reads longer than 200 bp for further analysis.
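The read filter described above can be sketched in a few lines. This is a minimal illustration that reads the text literally (drop every base call with a phred quality below 20, then keep the read only if more than 200 bases remain); the function name and the in-memory representation of a read are assumptions for illustration, not the authors' actual pipeline.

```python
from typing import List, Optional

MIN_QUALITY = 20   # phred quality cutoff stated in the text
MIN_LENGTH = 200   # a read must be longer than this many bases to be used


def filter_read(bases: str, quals: List[int]) -> Optional[str]:
    """Remove base calls with quality < MIN_QUALITY.

    Returns the filtered read, or None if 200 or fewer bases remain.
    (Hypothetical helper; the original trimming procedure is not
    described in more detail than the two cutoffs.)
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quality values must have the same length")
    kept = "".join(b for b, q in zip(bases, quals) if q >= MIN_QUALITY)
    return kept if len(kept) > MIN_LENGTH else None
```

A clone would then enter the mapping analysis only if both of its end reads pass this filter.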
The sequences of the fosmids or PCR products were determined by the standard random shotgun strategy. To selectively sequence the passenger regions of the lambda-like phages, each region was amplified by long PCR, using a relevant fosmid clone as the template and primers separately targeting the phage tail region and the chromosomal backbone. The fosmid clones and primers used are listed in Table S1 in the supplemental material. An ABI3730 automated sequencer (Applied Biosystems, Foster City, CA) was used for the sequence collection. Sequencher software (Gene Codes Corp., Ann Arbor, MI) was used for the assembly of the shotgun reads.
Translocation assay. For the translocation assay of T3SS effector candidates, we employed the TEM-1 β-lactamase system (8). The first 300 nucleotides of each gene were amplified by PCR. N-terminal translational fusion with the TEM-1 part of each PCR product was done using the pKTEM vector (kindly provided by Eric Oswald, INRA-ENVT, Toulouse, France), a pBBR1MCS2 derivative containing the TEM-1 gene at the XhoI-XbaI site (30). The recombinant plasmids were introduced into B171-8 cells and used for the translocation assay. All primers used to construct the fusion genes are listed in Table S2 in the supplemental material.
A translocation assay was performed as described by Charpentier and Oswald (8). HeLa cells grown on EZView glass-bottomed culture plates (LB 24-well plates; Iwaki, Tokyo, Japan) were infected with B171-8 cells containing each recombinant plasmid. After incubation at 37°C in 5% CO2 for 1.5 h, the cell cultures were washed three times with phosphate-buffered saline (PBS) and covered with 200 µl of CCF2/AM solution (Invitrogen). After a 1.5-h incubation in the dark at room temperature, the cells were washed three times with PBS. Each well was soaked in 500 µl of PBS and inspected with a Radiance2100 confocal laser scanning microscope (Bio-Rad Laboratories, Inc.). The fluorescence intensity (emission wavelengths, 460 and 530 nm) of each well was also scanned by a SpectraMax GeminiXS fluorescence microplate reader (Molecular Devices Corporation, Sunnyvale, CA).
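In this reporter system, cleavage of CCF2 by translocated β-lactamase shifts the emission from green (about 530 nm) toward blue (about 460 nm), so two-wavelength plate-reader data are often summarized as a blue-to-green ratio. The text does not state how the plate-reader values were scored, so the following is only a sketch under that assumption; the function name, the optional blank subtraction, and the example numbers are hypothetical.

```python
def emission_ratio(blue_460: float, green_530: float,
                   blank_460: float = 0.0, blank_530: float = 0.0) -> float:
    """Blue/green emission ratio for one well after optional blank subtraction.

    Assumption: a higher ratio suggests more CCF2 cleavage, i.e. more
    TEM-1 fusion protein delivered into the host cells. No cutoff is
    implied here because none is given in the text.
    """
    blue = blue_460 - blank_460
    green = green_530 - blank_530
    if green <= 0:
        raise ValueError("green signal must be positive after blank subtraction")
    return blue / green


# Hypothetical readings for a single well (arbitrary fluorescence units).
ratio = emission_ratio(350.0, 150.0, blank_460=50.0, blank_530=50.0)
```

Comparing each fusion against the positive control and an uninfected well would then be a matter of comparing these ratios.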
Nucleotide sequence accession numbers. All of the DNA sequences determined in this study have been submitted to the DDBJ/GenBank/EMBL database (accession numbers AB426048 to AB426064).
FIG. 1. Summary of WGPScanning and fosmid mapping analyses of EPEC B171-8. The results of the WGPScanning analysis of EPEC B171-8 are shown in the upper part of each segment. The initial data from the WGPScanning analysis are presented on the upper line, and the final results incorporating the data obtained by additional PCR analyses are shown on the lower line. Those segments which yielded PCR products with the same sizes as those from the reference strain (K-12) are indicated in gray, and those that were not amplified are shown in red. Those segments showing size reductions and size increments are indicated in dark blue and yellow, respectively. The results of fosmid mapping are shown in the bottom part of each panel. The positions of the end sequences of each fosmid clone on the K-12 chromosome are indicated by red (sense) and blue (antisense) arrowheads. The fosmid clones, whose end sequences were mapped on the K-12 chromosome, were classified into three groups (light blue bars, <29 kb; black bars, 29 to 48 kb; and pink bars, >48 kb). The positions of the K loops, regions deleted in B171-8, prophages, and other large GEIs integrated in the B171-8 chromosome are also indicated. The data from the first half and second half of the chromosome are shown in panels A and B, respectively.
Fosmid mapping analysis. We constructed a fosmid library of the EPEC B171-8 genomic DNA, collected 2,880 clones, and used them for mapping analysis of the K-12 chromosome. Our analysis of 95 clones randomly selected from the 2,880 clones indicated that their insert sizes ranged from 31.8 to 49.0 kb (40.5 kb on average) (see Fig. S1 in the supplemental material). The end sequences of the fosmid clones were determined, and 2,410 clones, from which high-quality sequences (>200 bp) were obtained from both ends, were used for the mapping analysis. Since the chromosome size of EPEC B171-8 was estimated to be about 5,250 kb by pulsed-field gel electrophoresis analysis of I-CeuI-digested genomic DNA (data not shown), the final coverage was 18.6 times (Table 1).
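The coverage figure can be checked from the numbers given in this paragraph. The sketch below assumes that coverage was computed as the number of usable clones times the average insert size divided by the estimated chromosome size; the text does not spell out the formula, but this reading reproduces the reported value.

```python
def fold_coverage(n_clones: int, mean_insert_kb: float, genome_kb: float) -> float:
    """Physical coverage of the chromosome by clone inserts (assumed formula)."""
    return n_clones * mean_insert_kb / genome_kb


# 2,410 clones with both ends sequenced, 40.5-kb average insert, 5,250-kb chromosome
coverage = fold_coverage(2410, 40.5, 5250)
print(f"{coverage:.1f}x")  # prints 18.6x
```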
TABLE 1. Summary of B171-8 fosmid library
About 80% of the fosmid end sequences showed ≥90% identity to the K-12 sequence, implying that about 20% of the B171-8 genome sequence (approximately 1 Mb) is absent in K-12. Of the 3,923 end sequences that were mapped to the K-12 chromosome (threshold, ≥90% identity and ≥30% coverage), 3,791 had single hits to the K-12 chromosome and the others hit multiple loci (Fig. 1). For 1,507 clones (62.5% of the 2,410 clones), both end sequences showed single hits (Table 2; see Table S4 in the supplemental material). Using the distances between their end sequences on the K-12 chromosome, we classified these clones into the following four groups: "short" clones (the distance was <29 kb), "normal" clones (29 to 48 kb), "long" clones (48 to 100 kb), and "superlong" clones (>100 kb) (Fig. 1; see Fig. S2 in the supplemental material). By examining the locations of the "long" clones, we identified 16 regions where large deletions have taken place (Fig. 1). All of these regions exhibited a large size reduction in the WGPScanning analysis.
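The clone classification can be written as a small function. This is a sketch only: the function name is hypothetical, and the handling of the exact boundary values (48 kb counted as "normal", 100 kb counted as "long") is an assumption, since the text gives the ranges without saying which side is inclusive.

```python
def classify_clone(distance_kb: float) -> str:
    """Classify a fosmid clone by the distance between its two end
    sequences on the K-12 chromosome (both ends having single hits)."""
    if distance_kb < 0:
        raise ValueError("distance must be non-negative")
    if distance_kb < 29:
        return "short"       # end distance on K-12 smaller than a typical insert
    if distance_kb <= 48:
        return "normal"      # consistent with the measured insert sizes
    if distance_kb <= 100:
        return "long"        # end distance on K-12 larger than a typical insert
    return "superlong"


labels = [classify_clone(d) for d in (12.0, 40.5, 75.0, 250.0)]
# labels == ["short", "normal", "long", "superlong"]
```

Because the inserts themselves are about 40 kb, a "long" clone spans K-12 sequence that is missing from B171-8 (a deletion), whereas a "short" clone carries B171-8 sequence that K-12 lacks (an insertion), which is how the two classes are used in the analysis.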
TABLE 2. Summary of FASTA search of fosmid end sequences against the K-12 chromosome
By searching for genomic regions that were covered by "short" clones or clones for which only one end sequence was mapped (referred to as "single-end-mapped" clones), we identified 22 regions which contained large insertions (Fig. 1). These regions corresponded to the unamplified segments or the segment with a 15.2-kb increment identified in the WGPScanning analysis. Among the 31 unamplified segments, 27 were associated with the insertion of 21 large GEIs and 2 were associated with the large inversion described above (Fig. 1). The remaining two unamplified segments were covered by multiple "normal" clones, which made it unlikely that these segments would contain large GEIs. These results indicate that we most probably identified all of the large (>10 kb) GEIs of EPEC B171-8 by the combination of WGPScanning and fosmid mapping analyses.
The results of the BlastX search of no-hit end sequences of "single-end-mapped" clones against the GenBank nr database indicate that 13 of the 22 GEIs are prophages (nine lambda-like phages, three P2-like phages, and one phage of an unknown phage group). The others are the LEE, two fimbrial biosynthesis operons, a lipopolysaccharide biosynthesis operon, the IAHP cluster (13), and ETT2 (E. coli type III secretion system 2), carrying the second set of T3SS genes that are widely distributed in E. coli (24). The origins of the remaining three GEIs were not predicted (Table 3; see Table S5 in the supplemental material).
TABLE 3. Large GEIs identified in EPEC B171-8
Comparison of the LEE of EPEC B171-8 with other sequenced LEEs. The LEE of B171-8 is 58 kb long and is integrated into the 3' end of the pheV tRNA gene. The gene organization of the LEE core region of B171-8 is almost identical to those of other sequenced LEEs (14, 17, 20, 24, 26, 47, 55) (Fig. 2). The LEE core region of B171-8 is associated with additional flanking sequences, as in some of the other LEEs.
FIG. 2. Comparison of the LEE of B171-8 with other sequenced LEEs. The gene organizations of the LEE of EPEC B171-8 and eight other sequenced LEEs are shown. A part of SpLE1, a large integrative element of O157 Sakai, is very similar to a part of the right flanking region of the B171-8 LEE. The genes on the two elements exhibiting high similarity (≥90% identity with ≥90% alignment) are indicated by dotted lines.
EELs of lambda-like prophages. We found that seven of the nine lambda-like prophages carry virulence-related genes in their EELs. These include genes for 20 T3SS effectors or effector homologues (five of them were pseudogenes), OmpT, and alpha/beta hydrolase (Fig. 3 and Table 4). In addition, at least four lambda-like prophages encode Lom homologues, which may be involved in adherence to human epithelial cells or survival in macrophages (1, 2, 51). It is noteworthy that most of these genes or their close homologues are also present in O157 Sakai, with a few exceptions, such as the Cif and OspB homologues.
FIG. 3. Gene maps of the EELs of nine lambda-like prophages identified in B171-8. The gene organizations and GC contents of the EELs of the nine lambda-like prophages identified in B171-8 are shown.
TABLE 4. Potential virulence factors identified in EPEC B171-8
FIG. 4. Comparison of the three P2-like prophages identified in B171-8. (A) The gene organizations of the P2 phage and three P2-like prophages of B171-8 are shown. Seventeen gene families that are not present in phage P2 but are present in other P2 family members are indicated by numbers. Gene family 14 is comprised of invertase genes involved in the inversion of tail fiber regions. (B) Dot plot matrices of the nucleotide sequences of the P2 phage and the three P2-like prophages of EPEC B171-8 are shown. The vertical arrows indicate prophage regions where genes for immunity-related proteins and integrases reside.
Other integrative elements and a phage remnant. GEI 3.21 is an integrative element integrated in the 3' end of the ileX tRNA gene. It contains three IS elements and genes for two AlpA-type transcriptional regulators and a LifA/Efa1 adhesin-like protein (Fig. 5). Most of the other genes are also conserved in some pathogenic E. coli strains, but their functions are unknown.
GEI 4.36 is an integrative element integrated into the pheU tRNA gene. The integrase gene of the GEI is identical to that of SpLE3 of O157 Sakai, which is integrated into the pheV tRNA gene on the O157 Sakai chromosome. Furthermore, a part of the GEI is almost identical to a part of SpLE3 (Fig. 5). Both elements contain numerous IS elements and encode three T3SS effectors (EspL2, NleB, and NleE; NleB is split in EPEC B171-8 but intact in O157 Sakai), a PagC-like protein, and an interrupted LifA/Efa1 adhesin-like protein. The 26-kb region of GEI 4.36 that is not present in SpLE3 encodes a small GTP-binding protein and an AidA-I adhesin-like protein.
GEI 4.52 also seems to be an integrative element integrated into the leuX tRNA gene, where an integrative element (KpLE2) is integrated in K-12. Because the left part of GEI 4.52 showed no structural difference from KpLE2 in the WGPScanning and fosmid mapping analyses, we determined the sequence of the right part of GEI 4.52, a 39-kb region between the fecI and yjhS genes. While KpLE2 contains 18 genes for various enzymes, transporters, and regulators in the fecI to yjhS region, the integrative element GEI 4.52 encodes restriction-modification systems, a small GTP-binding protein of unknown function, an AidA-I adhesin-like protein, an Hha homologue, and an AlpA-type transcriptional regulator. This region also contains many IS elements and a set of genes for hypothetical proteins. Similar sets of hypothetical genes are present on GEI 4.36 (Fig. 5) and also on three prophage-like elements of K-12 (Cp4-6, Cp4-44, and Cp4-57).
GEI 1.31 is a highly degraded prophage remnant. This prophage contains no virulence-related genes.
Translocation assay of T3SS effector homologues and GEI-encoded proteins of unknown function. Among the genes for 26 T3SS effectors or effector homologues identified in non-LEE loci (including the regions flanking the LEE core), 19 appear to encode intact proteins, while 7 are apparently pseudogenes (Table 4). In addition, EPEC B171-8 contains one gene encoding an EspM homologue (originally annotated as trcP) which resides on plasmid pB171 (49), but this gene has also been inactivated by an IS element inserted into the 5'-end region.
Of the 19 intact effector homologues encoded by the non-LEE loci, 17 are homologous to the effector proteins of O157 (48): 7 are nearly identical to their O157 counterparts, but 10 show lower amino acid sequence identities (38 to 92%). Of the two that are not present in O157, the Cif protein of B171-8 is 99% identical in amino acid sequence to that of the rabbit EPEC strain E22 (31), but the sequence of the OspB homologue is only 34% identical to the OspB protein of Shigella sonnei strain Ss046. In addition, many genes of unknown function, identified in the EELs of the lambda-like prophages and on the other prophages and integrative elements, could include new effector genes.
We therefore examined whether or not the 10 divergent effector homologues are translocated into host cells by using the TEM-1 assay system as described in Materials and Methods. TccP2 was used as a positive control in this assay (54). Forty-seven proteins of unknown function, encoded by the lambda-like prophages, P2-like prophages, and other sequenced GEIs, were also examined. Among the 10 effector homologues examined, 7 (NleH and NleG from GEI 0.81, EspM from GEI 1.18, NleG from GEI 2.21, NleG and EspM from GEI 3.10, and OspB from GEI 4.26) were clearly translocated into HeLa cells from B171-8 (see Table S6 in the supplemental material). We did not observe a clear translocation of NleA from GEI 0.81, NleF from GEI 2.04, and EspN from GEI 2.21, but these proteins will need to be examined by means of different assay systems. In contrast, none of the 47 proteins of unknown function were translocated in this assay.
Nine of the 13 prophages identified on the B171-8 chromosome are lambda-like phages, and 3 are P2-like phages. The three P2-like phages exhibit a surprisingly high level of similarity to each other (Fig. 4). Multiple lysogenization of very similar bacteriophages has been described for the lambda-like phages of O157 (24, 39). Our present data indicate that multiple P2-like phages that contain nearly identical genome sequences can also be lysogenized and stably maintained in some E. coli strains. The replacement or sequence diversification of genes for phage immunity and integrase may allow this in B171-8 (Fig. 4). The sequence similarity of the lambda-like prophages of B171-8 is unknown. However, like the lambda-like phages of O157 (24, 48), they carry numerous virulence-related genes, especially those for non-LEE-encoded T3SS effectors, in the EELs (Fig. 3 and Table 4). One of the integrative elements identified is the LEE of this strain (Fig. 2), and other integrative elements also carry many virulence-related genes, including those for non-LEE-encoded T3SS effectors and nonfimbrial adhesins (Fig. 5 and Table 4).
FIG. 5. Gene maps for three integrative elements and a prophage remnant identified in B171-8. The gene organizations and GC contents of three integrative elements and a prophage remnant are shown. The gene organizations of SpLE3 of O157 Sakai and KpLE2 of K-12 are also shown. The genes are colored according to their categories, as indicated in the box in Fig. 3.
The identification of 34 genes encoding T3SS effectors or effector homologues may be important for future studies of B171-8 pathogenicity. T3SS-expressing pathogens export a cocktail of effectors into the host cell, and various responses are induced in the host cell by these effectors. The 34 T3SS effectors or effector homologues of B171-8 are classified into 22 effector protein families. Intriguingly, all but Cif and OspB are also present in EHEC O157. We searched for new effectors by examining the proteins of unknown function encoded on the GEIs, but none were translocated into host cells. In a homology search of the draft sequence of B171-8 now available in the GenBank database (accession no. AAJX00000000), we detected no effector homologue other than those identified in this study and some effector-like proteins that are also present in LEE-negative E. coli strains (46). Thus, it is very likely that EPEC B171-8 contains a significantly smaller number of T3SS effectors than O157 does. Intact genes for the effector proteins belonging to the NleC, NleD, EspK, EspX, EspN, EspO, EspR, and EspJ families identified in O157 (46) are not present in EPEC B171-8, although little is known about their roles in the pathogenesis of O157 (12, 34, 48, 52). On the other hand, Cif and an OspB homologue were found in EPEC B171-8 but not in O157. Cif triggers an irreversible cytopathic effect characterized by the inhibition of the cell cycle G2/M-phase transition and the progressive recruitment of focal adhesion plaques leading to the assembly of stress fibers (33, 36, 46). The function of OspB is still unclear, and the OspB homologue found in B171-8 is highly divergent from that of Shigella flexneri (42), but the B171-8 OspB homologue was translocated into HeLa cells from B171-8 (see Table S6 in the supplemental material).
These findings indicate that not only the presence or absence of Shiga toxin genes but also the difference in the repertoire of T3SS effectors should be considered in examining the pathogenicity of O157 and B171-8, representing the EHEC and EPEC pathotypes, respectively.
Finally, we briefly mention a methodological aspect of this study. New sequencing technologies are currently becoming widely available, although they were not available when we planned and performed this study. These technologies are very powerful for resequencing bacterial genomes but produce only short sequence reads. Therefore, resequencing of relatively large bacterial genomes (E. coli genomes, for instance), which usually contain substantial amounts of repeat sequences, by these technologies yields highly fragmented draft sequences. These data are very useful for identifying single nucleotide polymorphisms in the conserved chromosome backbone, but it is not easy to use them directly for the systematic analysis of accessory chromosomal elements, which are strain specific (no reference sequence is available) and often contain many repeated sequences. For example, Manning et al. recently pyrosequenced an O157:H7 strain isolated in the 2006 spinach outbreak and obtained 201 large and 680 small contigs (32). Thus, we believe that the approach we employed in this study is still very useful for systematically identifying strain-specific large genomic regions and that combining one of the new sequencing technologies with WGPScanning and/or fosmid mapping would be the most powerful approach.
We thank Akemi Yoshida, Yumiko Takeshita, and Noriko Kanemaru for their technical assistance.
Published ahead of print on 29 August 2008.
Supplemental material for this article may be found at http://jb.asm.org/.