Journal of Bacteriology, September 2008, p. 5832-5840, Vol. 190, No. 17
0021-9193/08/$08.00+0 doi:10.1128/JB.00480-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Laboratory for Foodborne Zoonoses, Public Health Agency of Canada, Guelph, Ontario N1G 3W4, Canada,1 Microbial Evolution Laboratory, National Food Safety and Toxicology Center, Michigan State University, East Lansing, Michigan 48824,2 The University of British Columbia, Michael Smith Laboratories, 301-2185 East Mall, Vancouver, British Columbia V6T 1Z4, Canada3
Received 8 April 2008/ Accepted 17 June 2008
TABLE 1. Classification of VTEC serotypes into five seropathotype groupsa
43.4 kb long, contains 41 open reading frames which are organized in five polycistronic operons (LEE 1, LEE 2, LEE 3, LEE 5, and LEE 4) (34). Of particular interest in this study was LEE 5, which contains eae, the gene encoding the outer membrane adhesin intimin (34). This operon also contains genes that encode the translocated intimin receptor known as Tir (27) or EspE (7) and the Tir chaperone, CesT (1, 10). OI-122 is a 23-kb pathogenicity island in O157:H7 strain EDL 933 which consists of three distinct modules separated by mobile genetic elements (Fig. 1) (24, 38). The first module encodes Z4321, a gene product with 46% homology to the phoP-activated gene C product (PagC) that enables survival in macrophages of Salmonella enterica serovar Typhimurium (30, 31, 36) (Fig. 1). This module is present in strains ranging from those carrying a complete OI-122 with all three modules to those carrying an incomplete OI-122 that consists of only this module. Module 2 carries the Z4326 (sen) gene, whose product is 39% homologous to Shigella enterotoxin, and genes encoding two proteins, Z4328 and Z4329, with 89 and 86% sequence homology to non-LEE-encoded (Nle) effectors of Citrobacter rodentium, NleB and NleE, respectively (8, 24). The third module encodes Z4332 and Z4333, which are enterohemorrhagic E. coli (EHEC) factors for adherence (Efa1 and Efa2).
FIG. 1. Modular components of OI-122. ISA, insertion sequence-associated elements (or putative transposases) between the three modules. The PCR gene markers used to detect the presence of modules are indicated by bold type and blue.
MLST was chosen to shed light on the timelines for acquisition of these genomic islands in the evolutionary history of this pathogen by using inferred relationships among the collection of seropathotype strains. In this study, the VTEC strain group was analyzed using the MLST scheme developed at the STEC Center (http://www.shigatox.net/stec/mlst-new/index.html). Using this method, phylogenetic relationships were inferred based on differences among "core" genomes deduced from an analysis of seven highly conserved housekeeping genes (see Table S1 in the supplemental material). These carefully selected genes are presumably inherited vertically rather than horizontally and are subject to selective pressures, so there is slow, continual acquisition of random nucleotide changes (49). Studies have indicated that housekeeping genes diverge at a rate that reflects the overall rate of genome divergence due to vertical and horizontal transfer events, as well as genome reduction (29, 42, 49). The position of each node (strain) on a tree based on MLST data inherently reflects this genome diversity and provides a visual indication of how the genomes evolved relative to one another. The MLST technique was used to generate clustering patterns for 72 VTEC strains, which were then analyzed to determine correlations between clonal groups, seropathotypes, and genomic island content.
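As a rough illustration of how an MLST scheme turns sequences into clonal designations, the sketch below numbers each distinct allele at a locus and treats each distinct seven-locus allele profile as a sequence type (ST). The strain names, the toy sequences, and the numbering by order of first appearance are assumptions made for illustration; they are not the STEC Center's curated allele or ST numbers.

```python
# Hypothetical sketch: assign allele numbers and sequence types (STs)
# by order of first appearance. Real MLST databases use curated numbering.
LOCI = ["aspC", "clpX", "fadD", "icdA", "lysP", "mdh", "uidA"]


def assign_sequence_types(strains):
    """Map {strain: {locus: sequence}} to {strain: (allele_profile, st_number)}."""
    allele_ids = {locus: {} for locus in LOCI}  # per locus: sequence -> allele number
    st_ids = {}                                 # allele profile -> ST number
    result = {}
    for name, genes in strains.items():
        profile = []
        for locus in LOCI:
            seq = genes[locus].upper()
            table = allele_ids[locus]
            if seq not in table:
                table[seq] = len(table) + 1     # new allele at this locus
            profile.append(table[seq])
        profile = tuple(profile)
        if profile not in st_ids:
            st_ids[profile] = len(st_ids) + 1   # new combination of alleles
        result[name] = (profile, st_ids[profile])
    return result


# Toy data (placeholder sequences, not real housekeeping genes)
toy = {
    "strain_1": {locus: "ACGT" for locus in LOCI},
    "strain_2": {locus: "ACGT" for locus in LOCI},
    "strain_3": {**{locus: "ACGT" for locus in LOCI}, "uidA": "ACGA"},
}
types = assign_sequence_types(toy)
```

In this toy set, strain_1 and strain_2 share an ST, while strain_3 differs at a single locus and therefore receives a new ST.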
10^8 years ago (37), strain LT2 was selected for use as the outgroup (root) in the MLST tree.
FIG. 2. Overlay of genomic island, clonal group, and seropathotype distributions relative to the MLST-inferred phylogenies (based on seven housekeeping genes) for 72 VTEC isolates. The box indicates the strains with a premature stop codon in a pagC-like gene, Z4321. The symbols indicate the presence of a complete OI-122 and LEE (red triangles), the presence of an incomplete OI-122 and LEE (filled circles), the presence of only OI-122 module 1 and the absence of LEE (open circles), and the absence of OI-122 and LEE (gray inverted triangles). n/a, gene was not analyzed by PCR; CG, clonal group; ST, sequence type; UPEC, uropathogenic E. coli; EPEC, enteropathogenic E. coli; ShEt, Shigella enterotoxin.
The amplicons were purified using a Qiagen PCR purification kit (Qiagen Inc., Mississauga, Canada) and were sequenced using a DYEnamic ET terminator cycle sequencing kit and a MegaBACE 500 automated DNA sequencer (Amersham Biosciences UK Ltd., Buckinghamshire, England). Sequencing of amplified fragments was done in both directions and in duplicate, so that for each gene a consensus sequence was derived from four sequence reads using Discovery Studio Gene software (Accelrys Software Inc., San Diego, CA). The gene consensus sequences were aligned using ClustalX (45). For each strain, the seven-gene consensus sequence was concatenated using Molecular Evolutionary Genetics Analysis (MEGA), version 3.1 (28), in the order aspC-clpX-fadD-icdA-lysP-mdh-uidA, giving an MLST "supergene" sequence that was 3,558 bp long (see Table S2 in the supplemental material).
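The concatenation step can be sketched as follows. This is a minimal illustration and not the MEGA workflow itself: the function name and the toy sequences are hypothetical, and only the gene order is taken from the text.

```python
# Gene order as stated in the text; everything else here is illustrative.
GENE_ORDER = ["aspC", "clpX", "fadD", "icdA", "lysP", "mdh", "uidA"]


def build_supergene(consensus_by_gene):
    """Concatenate per-gene consensus sequences in the fixed MLST gene order."""
    missing = [g for g in GENE_ORDER if g not in consensus_by_gene]
    if missing:
        raise ValueError("missing consensus sequence for: " + ", ".join(missing))
    return "".join(consensus_by_gene[g].upper() for g in GENE_ORDER)


# Toy consensus sequences (3 bp each, placeholders only)
toy_consensus = {
    "aspC": "atg", "clpX": "ccc", "fadD": "GGG", "icdA": "TTT",
    "lysP": "AAA", "mdh": "CAT", "uidA": "TAA",
}
supergene = build_supergene(toy_consensus)
```

Fixing the order in one place ensures that every strain's supergene is assembled identically, so that aligned positions remain comparable across strains.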
Sequencing of the Z4321 (pagC-like) gene. The Z4321 locus was PCR amplified using forward primer 5'-ATGAGTGGTTCAAGACTGG-3' and reverse primer 5'-CCAACTCCAACAGTAAATCC-3', yielding a 521-bp amplicon (24). This amplicon was sequenced as described above, and the sequence data were cropped to 399 bp prior to ClustalW alignment with Discovery Studio Gene software.
Data analysis. The "supergene" sequence was used to generate a dendrogram using MEGA software (neighbor-joining algorithm with the Tajima-Nei model of genetic distance and bootstrapping of 1,000 replicates). The tree was subsequently examined for patterns in the genomic island contents of strains in the context of their seropathotype and clonal group designations.
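For readers unfamiliar with the distance model named above, the following sketch computes a pairwise Tajima-Nei distance between two aligned sequences, assuming the standard formulation d = -b ln(1 - p/b), with b = 1/2 [1 - sum(g_i^2) + p^2/c] and c = sum over base pairs i<j of x_ij^2 / (2 g_i g_j). Here p is the proportion of differing sites, g_i the average base frequencies, and x_ij the relative frequency of sites carrying the mismatched pair i, j. Skipping gapped or ambiguous sites is an assumption of this sketch and may not match how MEGA was configured in the study.

```python
import math
from itertools import combinations

BASES = "ACGT"


def tajima_nei_distance(seq1, seq2):
    """Tajima-Nei distance between two aligned DNA sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / n
    if p == 0:
        return 0.0
    # Average base frequencies over both sequences.
    g = {x: sum((a == x) + (b == x) for a, b in pairs) / (2 * n) for x in BASES}
    # c sums over the six unordered pairs of distinct bases.
    c = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum({a, b} == {i, j} for a, b in pairs) / n
        if x_ij > 0:
            c += x_ij ** 2 / (2 * g[i] * g[j])
    b_coef = 0.5 * (1 - sum(f ** 2 for f in g.values()) + p ** 2 / c)
    if p >= b_coef:
        raise ValueError("sequences too divergent for this correction")
    return -b_coef * math.log(1 - p / b_coef)


# Toy aligned sequences differing at 2 of 20 sites (p = 0.1)
toy_a = "ACGT" * 5
toy_b = "GCGTATGT" + "ACGT" * 3
toy_distance = tajima_nei_distance(toy_a, toy_b)
```

A matrix of such pairwise distances is what a neighbor-joining algorithm takes as input; the tree building and bootstrapping themselves are not shown here.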
Correlations were also sought between the presence of complete or incomplete OI-122 and LEE and the strains' propensity to cause disease. Genes spanning all three modules of OI-122 (Z4321, Z4326, Z4332, and Z4333) (Fig. 1) and the eae locus of the LEE (Z5110) were previously amplified for the seropathotype collection (24). The primers used for PCR amplification were described by Karmali et al. (24).
A protein tree of Z4321 was also made using MEGA (unweighted-pair group method using average linkages algorithm with bootstrapping of 1,000 replicates) to determine subgroups of strains based on amino acid differences. The Nei-Gojobori procedure (28) was also performed using MEGA to evaluate the substitution rates.
S. enterica serovar Typhimurium LT2 differed significantly at the genetic level from the VTEC isolates and other strains and so constitutes the outgroup node or root of the tree. S. flexneri strain 2a clustered together with uropathogenic E. coli strain CFT073 and two seropathotype D serotype O117:H7 strains. Reference strain K-12 clustered most closely with the other commensal strain, strain HS, and with low-virulence strains (seropathotype D and E strains and environmental isolates ECOR-01 and ECOR-04).
Sequencing of the pagC-like Z4321 gene. Overall, 11 single-nucleotide polymorphisms (SNPs) and one indel (insertion of an adenine nucleotide) were found in the pagC-like gene of 43 VTEC strains positive for this locus (see Table S3 in the supplemental material). The results for three O145:NM strains and one O119:H25 strain had discrepancies with previous results for the pagC-like locus (24) due to the presence of weak PCR bands. Of greatest interest in the SNP analysis were the mutations that led to amino acid substitutions in the encoded protein. Figure 3 shows the interrelatedness of Z4321-positive strains and their groupings based on differences in protein sequence. Sequence analysis of the pagC-like Z4321 locus revealed a nonsense mutation in five strains, three seropathotype D strains (human serotype O103:H25 and O119:H25 strains) and two seropathotype E strains (bovine O98:H25 and O84:NM strains), as shown in Fig. 2. Strains with this decayed OI-122 module belong to a single MLST clonal group, group 20, with high tree branch reliability (100% bootstrap support). They are grouped together despite variations in host (human and bovine), serotype (somatic antigens O103, O119, and O98 with flagellar antigen H25 and O84:NM), seropathotype (seropathotypes D and E), and genomic island content (LEE positive with incomplete and complete forms of OI-122). The mutation in the Z4321 gene is the result of insertion of an A at nucleotide 388, which led to a shift in the reading frame and changed the last two codons of the protein product before the premature stop (Fig. 3). Thus, this outer membrane protein is truncated at the end of the third transmembrane loop and lacks the fourth and final loop of PagC. The nucleotide sequence downstream of the stop codon is conserved among the five strains with this mutation. 
Among the five strains with this truncated gene product are the only human seropathotype D strains in the VTEC collection in which both LEE and OI-122 module 1 are present. These less virulent (seropathotype D) human isolates either have modules 1 and 2 but lack module 3 (serotype O119:H25) or have a complete OI-122 (serotype O103:H25).
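The effect of a single-base insertion on a reading frame can be illustrated with a toy open reading frame. The sequence below is invented for illustration and is not the Z4321 sequence. It is constructed so that, as described above for the real mutation, an inserted A changes two codons and then creates a premature stop.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna):
    """Translate from the first base until the first stop codon (stop excluded)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_base(dna, position, base):
    """Insert `base` so that it becomes nucleotide number `position` (1-based)."""
    return dna[:position - 1] + base + dna[position - 1:]


# Hypothetical 24-bp ORF: ATG GCT CTG GGT CAT GAC TGG TAA
wild_type = "ATGGCTCTGGGTCATGACTGGTAA"
mutant = insert_base(wild_type, 10, "A")   # A inserted at the start of codon 4

wild_protein = translate(wild_type)        # "MALGHDW"
mutant_protein = translate(mutant)         # "MALRS": two altered residues, then stop
```

In the toy case the insertion falls at the first position of a codon, as does nucleotide 388 in Z4321 (388 = 3 x 129 + 1), so every downstream codon is read one base out of register.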
FIG. 3. Phylogenetic tree based on PagC-like Z4321 protein in 43 VTEC strains. SNPs in Z4321 in S. enterica serovar Typhimurium were too numerous to display (TNTD). For an explanation of the symbols, see the legend to Fig. 2. n/a, gene was not analyzed.
There is a division among seropathotype C strains with flagellar H21 antigen (O91:H21 and O104:H21 versus O113:H21) as a result of a mutation at nucleotide 119 (Fig. 3). O113:H21 strains encode Tyr (TAT) at this location, and this amino acid is unique to this group compared with all the other Z4321-positive VTEC strains. The O113:H21 strains make up STEC 2 clonal group 30, while the O91:H21 and O104:H21 strains, which encode Ser (TCT), constitute STEC 1 clonal groups 34 and 18, respectively.
G+C content. It was observed that the average G+C content of the housekeeping genes (range, 51.5 to 54.0% [see Table S2 in the supplemental material]) corresponds well with the overall host genome base composition (E. coli K-12 [accession no. NC_000913], 50.8%; EDL 933 [accession no. NC_002655], 50.4%). The average G+C content of the pagC-like Z4321 gene is 40.0%, which is low compared to that of the overall genome, as expected for a gene on a pathogenicity island (17).
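A G+C percentage of the kind compared here can be computed with a few lines. This is a generic sketch; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption rather than a detail reported in the study.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


toy_percent = gc_content("ATGCNN")  # N is skipped, so 2 of 4 bases are G or C
```

Applying such a function to a gene and to the whole genome gives the two percentages being contrasted; a markedly lower value for the gene is the kind of signal the text interprets as evidence of foreign origin.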
Substitution rates. The Nei-Gojobori procedure (28) was performed using MEGA to evaluate the substitution rates for individual housekeeping genes and the concatenated supergene, as well as the pagC-like gene. More specifically, the numbers of synonymous (silent) substitutions per site (pS) and of nonsynonymous substitutions per site (pN), the latter leading to differences in amino acid sequence, were estimated for the housekeeping genes and the pagC-like Z4321 locus. The assumption is that the two rates are expected to be equal under neutral evolution (pS/pN = 1), whereas positive (diversifying) selection occurs when pN > pS and negative (purifying) selection occurs when pS > pN (35). For each of the housekeeping genes and the concatenated supergene, the rate of synonymous mutation is higher than the rate of nonsynonymous mutation (pS > pN), which implies that there is purifying selection (Table 2; see Table S4 in the supplemental material), as expected. In the housekeeping genes, the rate of synonymous mutation is approximately 42-fold higher than the rate of nonsynonymous mutation (Table 2). The most divergent housekeeping gene (i.e., the least conserved gene) with the highest rate of nonsynonymous substitution is uidA (see Table S4 in the supplemental material). The rate of nonsynonymous substitution was 20-fold higher in the pagC-like gene than in the housekeeping genes, which is consistent with a divergent gene; further, the rate of synonymous substitution was 1.6 times lower.
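The decision rule described above, together with the Jukes-Cantor correction that is commonly applied to such proportions, can be written as a short sketch. The function names and the numerical inputs are hypothetical and are not values from Table 2; estimating pS and pN from sequences (the Nei-Gojobori counting step) is not shown.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor correction: d = -(3/4) ln(1 - 4p/3) for a proportion p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def selection_regime(p_s, p_n):
    """Classify selection from synonymous (p_s) and nonsynonymous (p_n) rates."""
    if p_n > p_s:
        return "positive (diversifying)"
    if p_s > p_n:
        return "negative (purifying)"
    return "neutral"


# Hypothetical proportions chosen to mimic a 42-fold excess of synonymous change
example_ps, example_pn = 0.042, 0.001
regime = selection_regime(example_ps, example_pn)
corrected_ps = jukes_cantor(example_ps)
```

With these illustrative inputs the rule returns purifying selection, which is the outcome the text reports for the housekeeping genes.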
TABLE 2. Rates of synonymous and nonsynonymous substitution in Z4321 and in the MLST supergene sequence based on Nei-Gojobori and Jukes-Cantor analysis for the overall VTEC collection and for seropathotypes
E. coli O157:H7 strains associated with outbreaks and severe epidemicity have been shown to represent a single phylogenetic branch when they are grouped by MLST (using seven housekeeping genes), comprising 100% of the seropathotype A strains (40, 50, 51). It has been postulated that over the last 50 years, the pathogenic O157 lineage has evolved from an enteropathogenic E. coli O55:H7 group with the acquisition of verotoxin-converting phages and an O157 rfb (O-antigen subunit) gene cluster (3). This finding was corroborated in the current study, where all of the O157 strains clustered as a single group whose nearest neighbor was the enteropathogenic E. coli O55 strain (Fig. 2). The O157 group of strains and their O55 ancestor have either converged or diverged from non-O157 VTEC at some point in their evolutionary history. This separation may coincide with the evolutionary split of O157 and K-12, which occurred 4.5 million years ago (21). S. flexneri 2a, which mapped among the outliers from the major non-O157 cluster, is more closely related to K-12 than to EDL 933 (O157 seropathotype A) (21), and this was confirmed by the finding that these organisms share a more recent common ancestral node in the tree. The facts that uidA was found to be the least conserved of the housekeeping genes and that it was absent only in S. enterica serovar Typhimurium LT2, which is the outlier strain in the MLST tree, reaffirm the structure of the tree. There have been more evolutionary splits from Salmonella in the non-O157 clusters than in the O157 cluster.
The highly branching non-O157 group reflects a high degree of genetic rearrangement compared to the O157 cluster. It can be postulated that losing genetic factors and moving from virulent to less virulent may give new non-O157 variants a selective advantage in surviving and/or in contributing to pathogenesis during VTEC infection. Some of the internal non-O157 branches in the tree may represent the fastest-evolving strains (including seropathotype D and E strains with a nonsense mutation in pagC-like gene Z4321), and a lot of variation in genomic island content has been observed within these closely related subclusters. High substitution rates in non-O157 (seropathotype C, D, and E) strains corroborate this observation. Examination of the occurrence of the LEE and components of OI-122 in widely divergent MLST clonal groups has provided striking evidence of horizontal transfer of chromosomal genes and pathogenicity islands. Furthermore, this study, using OI-122 as an example, provides novel insights into the acquisition and fate of island components.
Evidence of horizontal gene transfer among non-O157 strains. The O157 EHEC 1 group represents the only VTEC group for which there is a direct correlation between seropathotype (seropathotype A) and genomic island content (LEE positive with complete OI-122). Otherwise, among the non-O157 VTEC strains, it is clear from the MLST clustering patterns that seropathotype and genomic island distribution are not clonally restricted. In fact, seropathotypes are widely dispersed throughout the tree and are more widely dispersed with decreasing level of epidemicity (seropathotype A clusters on one branch, seropathotype B clusters on four branches, seropathotype C clusters on five branches, seropathotype D clusters on nine branches, and seropathotype E clusters on nine branches [Fig. 2]). LEE and the various OI-122 forms (complete, incomplete, or absent) are widely distributed in different lineages (clonal groups) and seropathotypes. Modular components of OI-122 that are variably present or absent on a branch containing closely related strains likely were obtained from less closely related strains or species via horizontal gene transfer. For example, examination of the MLST tree around seropathotype C, D, and E strains shows that there is a mixture of LEE-positive and -negative strains in a branch. This scattered distribution of eae genes in the MLST tree is characteristic of a horizontal transfer event when a set of genes has been introduced into a lineage (9). Among seropathotype B through E strains, OI-122 modules and their gene content are similarly scattered throughout the tree. For example, in branches where all strains belong to the same serotype and have the same incomplete form of OI-122, it is likely that components were acquired in a modular manner and became stabilized in the genome. 
The observations for LEE and OI-122 described above support the notion that the virulence genes comprising these gene cassettes have been and continue to be horizontally transferred across lineages. The wide MLST clonal distribution of these two islands and the lack of association between seropathotype, genomic island content, OI-122 module content, and MLST clustering patterns are also indicative of horizontal transfer among strains (Fig. 2).
pagC-like Z4321 deletion mutants in less virulent strains. The evidence of decay of OI-122 elements in module 1 (Z4321) in five strains that belong to seropathotypes D and E correlates with the apparent reduced virulence of these LEE-positive strains. It also strongly indicates that there has been horizontal gene transfer since this mutated genetic element is shared by strains whose seropathotype and genomic island profiles and hosts differ but the strains are closely related phylogenetically (15). The sequence flanking the indel (an adenine insertion), particularly downstream of the premature stop codon, which no longer encodes a functional protein product, is conserved in these different strains. Based on the minimal assumption of evolution, this insertion was introduced once at some point in the evolutionary history of this collection and was passed horizontally among the strains (15). The OI-122 modular patterns may reflect horizontal acquisition of one or more modules independently or modular decay following transfer of a complete OI-122 (Fig. 3). A correspondingly high rate of synonymous and nonsynonymous (detrimental) substitutions in the Z4321 gene in these less virulent seropathotypes is also consistent with a decaying or inactive gene. Functional protein studies with Yersinia and Salmonella involving closely related Ail and Rck proteins indicated that the fourth extracellular loop (absent in the truncated Z4321 protein) is not associated with adhesion, invasion, or serum resistance phenotypes (5, 32, 36). The third loop (full length in the truncated Z4321 protein) has been shown to confer virulence properties in Rck (5). In the future, functional assays may be performed to assess the impact of this mutation on the Z4321 protein in VTEC. Wickham et al. showed that there is a significant association between the presence of a combination of pagC and sen (ent), nleB, and efa-1/lifA and HUS after infection in non-O157 E. coli (46). 
On its own, the pagC-like gene is associated with HUS but not with outbreaks (46). It is interesting that the seropathotype E strains in this study which had the mutated pagC gene were of bovine origin and also did not contain efa-1 (Fig. 2). It has been proposed that the additive effect of these two genes contributes significantly to causing HUS (6, 46). While the pagC locus may contribute to pathogenesis in more virulent VTEC, pseudogenization may have hampered its activity and given rise to these less virulent variants. The observation that human strains without this deleterious mutation in pagC (on a complete OI-122 with LEE present) are seropathotype A, B, or C strains and strains with the truncated gene (on a complete or incomplete OI-122 with LEE) are seropathotype D strains supports this theory.
SNPs in Z4321 useful for differentiation of O113:H21 and the presence of LEE. Human seropathotype C strains with flagellar H21 antigen were originally classified in one clonal group (http://www.shigatox.net/cgi-bin/stec/clonal). Data from this study show that these strains belong to different clonal groups (Fig. 2), and this was corroborated by a SNP in Z4321 (A→C at bp 119) that results in an O113:H21-specific Tyr residue (in clonal group 30) instead of Ser, which is found in all the other serotypes tested, including O91:H21 (clonal group 34) and O104:H21 (clonal group 18). While they share a common flagellar H21 antigen and have the same OI-122 and LEE profiles (LEE negative and OI-122 module 1 only [Fig. 2]), these groups split at some point in their evolutionary history. There may be other genomic differences between these groups, but targeting this SNP may be a quick way to differentiate between the H21 clonal groups and to screen for the O113:H21 serotype.
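To see why a change at bp 119 alters the encoded residue, it helps to map the nucleotide position onto its codon. The sketch below does this with 1-based coordinates, assuming that numbering starts at the first base of the start codon. The helper names are hypothetical, and the two-entry lookup uses only standard genetic code assignments (TAT = Tyr, TCT = Ser).

```python
def codon_coordinates(nt_position):
    """Map a 1-based nucleotide position to (codon number, position in codon)."""
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


def apply_snp(codon, position_in_codon, new_base):
    """Return the codon with one base replaced (position is 1-based)."""
    index = position_in_codon - 1
    return codon[:index] + new_base + codon[index + 1:]


# Standard genetic code entries relevant to this SNP
AMINO_ACID = {"TAT": "Tyr", "TCT": "Ser"}

codon_number, position_in_codon = codon_coordinates(119)   # (40, 2)
variant_codon = apply_snp("TAT", position_in_codon, "C")   # "TCT"
residues = (AMINO_ACID["TAT"], AMINO_ACID[variant_codon])  # ("Tyr", "Ser")
```

Because bp 119 is the second base of codon 40, an A/C difference there is necessarily nonsynonymous, which is what makes it usable as a typing marker.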
A second nonsynonymous SNP at nucleotide 207 of the Z4321 gene allowed us to predict the presence of LEE because strains having His at the corresponding position harbor eae, while strains with Gln lack eae. An interesting observation is that only the strains that carry the pagC-like gene exclusively (modules 2 and 3 are not present) both have this nonsynonymous substitution (His→Gln) and lack LEE. Strains with this property include O113:H21 strain CL3, in which Z4321 is part of a mosaic island cointegrated with OI-48 (41). Screening for a marker, Z1640::S1, that is indicative of this hybrid island indicated that other serotypes in the VTEC collection with this characteristic include O156:NM, O171:H2, O7:H4, O88:H25, and O91:H21 (41). It is unclear whether the pagC-like gene first appeared in strains such as O157:H7 strain EDL 933 as part of a complete OI-122 or as part of the OI-122::OI-48 mosaic island, as observed in O113:H21 strain CL3. The pagC alleles (alleles 1 and 4) of LEE-positive strains have four to eight nucleotide differences compared with the allelic variants (alleles 2 and 3) of the LEE-negative lineages (see Table S3 in the supplemental material). The pagC-like alleles may have been exchanged between the LEE-positive and -negative lineages at some point, along with the acquisition of SNPs. Given that there are more than a few nucleotide differences between these genes, the possibility that the genes may also have arisen from a separate ancestor cannot be overlooked.
Concluding remarks. Bacterial evolution is driven by the need to achieve optimal "fitness," a concept that refers to attributes that enhance the survival, spread, and/or transmission of an organism within a specific ecological niche (16, 39). Horizontal gene transfer and gene degradation provide mechanisms for rapid adaptation to changing ecological circumstances or for acquiring optimal fitness so that an organism can survive and flourish under such circumstances (16). The evolutionary advantage of acquiring genomic islands over acquiring smaller genetic elements is that a large number of genes encoding many complementary functions may be transferred en bloc to the recipient organism, a process that may result in "evolution in quantum leaps" (14); one example of this is the acquisition of a type III secretion system which is encoded by LEE (17). On the other hand, a minor environmental change may not require acquisition of all the genetic material present in a genomic island, and the transfer of smaller elements, such as plasmids or transposons, may be more efficient. Considering that we did not explore the LEE in this study to the same extent as OI-122, further analysis should shed more light on the interplay of these genomic islands. OI-122 has three modules, each consisting of genes associated with mobile genetic elements, including transposase genes. One or more of these elements may thus be transposons, a concept supported by the occurrence of one, two, or three OI-122 modules in individual strains. Transposons are typically associated with the transfer of antimicrobial resistance genes under the selective pressure of antibiotics. However, transposons containing genes that encode catabolic functions have also been described, and their presence may be selected by specific substrates (43). Environmental selective factors that could be involved in selecting specific OI-122 modules remain to be investigated. 
Knowledge about the ecological determinants of the presence, absence, or decay of specific OI-122 modules could provide new insights into the origin of pathogenic clones expressing specific modular patterns.
This population genetics study provided new insights about two genomic islands in the evolution of pathogenic VTEC. The results support the hypothesis that genomic islands in VTEC are horizontally acquired and that some of them, like OI-122, are likely acquired in a modular manner. It appears that the less virulent VTEC strains have experienced a loss of genomic island components. Further work can address the question of what role the horizontally acquired islands play in the emergence of new pathogens.
This research was supported by the Public Health Agency of Canada, as well as by a Canadian Institutes of Health Research Food and Water Safety grant and by operating grants from the Canadian Institutes of Health Research and the Howard Hughes Medical Institute.
Published ahead of print on 27 June 2008.
Supplemental material for this article may be found at http://jb.asm.org/.