Journal of Bacteriology, June 2004, p. 3547-3560, Vol. 186, No. 11
0021-9193/04/$08.00+0 DOI: 10.1128/JB.186.11.3547-3560.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Bacterial Pathogenesis and Genomics Unit, Division of Immunity and Infection, Medical School, University of Birmingham, Birmingham B15 2TT, United Kingdom (1); Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110 (2)
Received 19 September 2003/ Accepted 12 February 2004
Initial studies of UPEC, and later of other pathotypes, suggested that E. coli strains often acquire new complex pathogenic phenotypes in a single step by the acquisition of pathogenicity islands, which contain virulence genes clustered on the chromosome and which are acquired en bloc by horizontal gene transfer (21). Similar studies with the related bacterium Salmonella enterica have delineated several Salmonella pathogenicity islands (Spi-1, Spi-2, Spi-3, etc.) (2, 22). The horizontal transfer of DNA by mobile elements such as bacteriophages and plasmids is also known to play a role in the evolution of virulence in E. coli and Shigella (15). However, if bacterial genomes are subject to a continual ingress of novel DNA through horizontal gene transfer, the question is raised of how this DNA is removed from the genome should it cease to provide any selective advantage (33). Furthermore, some authors have questioned the utility of the pathogenicity island concept when in some cases the position, order, and clustering of virulence genes seem remarkably fluid (61).
Many strains of EHEC and EPEC, like S. enterica, utilize type III secretion to subvert eukaryotic signaling pathways by injecting bacterial effector proteins into the host cell cytoplasm (27, 30). Within these pathotypes of E. coli, a well-characterized type III secretion system (TTSS), similar to the Spi-2 system of S. enterica and encoded by a pathogenicity island termed the locus of enterocyte effacement (LEE), is responsible for the development of the attaching-effacing lesion and for other effects on enterocyte functions (21, 30, 37, 38, 48).
An analysis of the complete genome sequences of two strains of EHEC O157:H7 revealed genes that potentially encode a second cryptic TTSS, which has been termed ETT2 (for E. coli TTSS 2, with the term ETT1 reserved for the LEE-encoded TTSS) and which resembles the Spi-1 TTSS from S. enterica (24, 49). Three recent studies have shown that the ETT2 gene cluster is found in some pathogenic strains of E. coli in addition to O157:H7 (23, 35, 40). However, these studies failed to agree on the boundaries of the cluster, were limited in terms of the phylogenetic diversity of the strains they sampled (leading to the erroneous conclusion that ETT2 is largely absent from nonpathogenic strains), and did not describe any ETT2-associated chaperone or translocator genes.
Prompted by the discovery of ETT2, we wished to address several interrelated questions: how widespread is the ETT2 gene cluster among E. coli isolates, how has this pathogenicity island evolved, and where are the ETT2 chaperones and translocators? In pursuit of these goals, we performed in silico analyses of ETT2 gene clusters available from genome sequences and other sources and developed a PCR-based approach, called tiling-path PCR scanning (TP-PCR), that allowed us to construct a complete tiling path through a 40-kb fragment of the chromosome centered on the ETT2 gene cluster for 79 well-validated and phylogenetically diverse strains drawn from the ECOR collection and representatives of selected pathotypes. We were surprised to discover that the ETT2 gene cluster is present in whole or in part in the majority of E. coli strains, but that in almost all cases, including that of EHEC O157:H7, it has been subjected to varying degrees of mutational attrition. In addition, we found a second type III secretion-associated locus (eip) in some E. coli strains which we predict encodes the ETT2 translocation apparatus.
Each coding sequence annotated in the ETT2 cluster in the published E. coli O157:H7 Sakai genome sequence (24) was analyzed by BLAST searches on the coliBASE server, supplemented as needed by visualization of the genomic context of homologues, by G+C percentage plots using an Artemis applet (53) within coliBASE, by multiple alignments on the EBI Clustal server (http://www.ebi.ac.uk/clustal), by searches of the National Center for Biotechnology Information's conserved domain database (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml), and by PSI-BLAST searches on the ViruloGenome server (http://www.vge.ac.uk). Any coding sequences in the Spi-1 TTSS that were not represented in the ETT2 cluster were subjected to similar analyses, as were those in the newly discovered eip gene cluster.
The length of each predicted coding sequence in the ETT2 gene cluster from each strain was compared with the lengths of homologous predicted coding sequences in every other E. coli strain and in the Spi-1 system. When two coding sequences in one strain were represented by a single longer sequence in another strain or when a single coding sequence in one strain was judged to be substantially longer than those in another (allowing for the fact that GLIMMER sometimes makes mispredictions of start sites or even of coding sequences), the longer sequence was assumed to represent the physiologically active ancestral state, while the shorter sequences were judged to be pseudogenes.
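The length-comparison rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the strain labels, the example lengths, and the 80% cutoff are all assumptions made for the sketch, since the text gives no numerical threshold for "substantially longer".

```python
def flag_pseudogenes(fragments_by_strain, min_fraction=0.8):
    """Flag putative pseudogenes by comparing homologous CDS lengths.

    fragments_by_strain maps a strain name to the list of predicted CDS
    lengths covering one homologous gene (more than one entry means the
    gene is split into several shorter CDSs in that strain).
    min_fraction is a hypothetical cutoff, not a value from the study.
    """
    # The longest CDS seen in any strain is taken as the ancestral state.
    reference = max(max(frags) for frags in fragments_by_strain.values())
    flags = {}
    for strain, frags in fragments_by_strain.items():
        split = len(frags) > 1                           # one gene, several CDSs
        truncated = max(frags) < min_fraction * reference
        flags[strain] = split or truncated
    return flags


# Hypothetical lengths (in codons) for one gene in three strains.
example = {"strain_A": [300], "strain_B": [160, 130], "strain_C": [140]}
flags = flag_pseudogenes(example)
```

Because gene finders can mispredict start sites, any call made this way would still need manual inspection, as the text notes for GLIMMER.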
Phylogenetic analysis. Homologues of EivC (from the Sakai strain) and EicA (from EAEC 042) were identified by a PSI-BLAST search of the NCBI nonredundant protein database supplemented with predicted protein products from unfinished genome sequences on the ViruloGenome web site (http://www.vge.ac.uk). These protein sequences were aligned with ClustalW, version 1.8 (58), with minor manual adjustments done with SeaView (20). All positions with gaps were removed from the alignment, and phylogenetic trees were generated by the neighbor-joining algorithm (54), as implemented in ClustalW. The topology of the EivC tree was assessed by using 1,000 bootstrap replicates.
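The gap-removal step applied before tree building can be illustrated as follows. This is a sketch under stated assumptions: the function name and the toy sequences are hypothetical, and the study itself carried out alignment and neighbor joining with ClustalW rather than with custom code.

```python
def remove_gapped_columns(aligned, gap_chars="-."):
    """Drop every alignment column in which any sequence has a gap.

    aligned maps a sequence name to its aligned sequence; all aligned
    sequences must have the same length.
    """
    seqs = list(aligned.values())
    if not seqs:
        return {}
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # Keep only the columns that are gap-free in every sequence.
    keep = [i for i in range(len(seqs[0]))
            if all(s[i] not in gap_chars for s in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in aligned.items()}


# Toy alignment: columns 1 and 2 each contain a gap and are removed.
toy = {"a": "MK-LV", "b": "MKALV", "c": "M-ALV"}
ungapped = remove_gapped_columns(toy)
```

Removing gapped columns means that every pairwise distance is computed over the same set of positions, at the cost of discarding some signal from indel-rich regions.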
Bacterial strains. Details of the bacterial strains used for this study are provided in Table 1. The E. coli reference (ECOR) strain collection was kindly supplied by Thomas Whittam and has been described elsewhere (44; http://foodsafe.msu.edu/whittam/ECOR). Representatives of other pathotypes, including neonatal meningitic E. coli strain RS218, EAEC strain 042, enterotoxigenic E. coli (ETEC) strain H10407, EAEC strain EAEC25, UPEC strain CFT073, and E. coli strain K-12, were kindly provided by Ian Henderson (University of Birmingham), while an isogenic nontoxigenic derivative of the E. coli O157:H7 Sakai strain was a kind gift from Chihiro Sasakawa (University of Tokyo). Two sets of clinical isolates were obtained in early 2003 from the clinical microbiology laboratory of the Queen Elizabeth Hospital, Birmingham, United Kingdom: they were a series of 43 consecutive blood culture isolates of E. coli and a series of 36 putative commensal strains obtained by subculturing E. coli isolates present in less than significant numbers (<10^7/ml) from urine samples without pyuria or any other evidence of infection (i.e., presumed to be perineal contaminants).
TABLE 1. Distribution of ETT2 island PCR fragments among E. coli strains
5-kb fragments that spanned the region of interest (Fig. 1 and 2) but that also had short overlaps of a few hundred base pairs. Long PCRs were performed to survey the region in question. Any negative results by long PCR for a given primer pair were followed up by amplification of the relevant short overlaps with the same primers and/or a deletion-scanning long PCR that employed primers flanking the lost segments (Fig. 1). For two strains for which there is published evidence of the precise site of an insertion or deletion (an 8.7-kb deletion in the EPEC2 B171-8 sequence or an unoccupied yqeG-glyU intergenic region in the CFT073 sequence), a short PCR centered on the insertion-deletion (in-del) was employed to screen other strains in our collection. In a few cases for which these approaches failed to produce coherent results, the original long PCRs and/or the deletion-scanning PCRs were repeated with a different enzyme preparation. In some supplementary experiments, for which we decided that a complete tiling path was not required (the surveys for ETT2 in clinical isolates and the surveys of the eip cluster), rapid short PCR scans using TP-PCR primer sets were performed.
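The logic of a tiling path, and of following up runs of negative long PCRs, can be sketched as below. All specifics here are assumptions for illustration: the function names are hypothetical, the 300-bp overlap is one reading of "a few hundred base pairs", and real primer placement would depend on sequence constraints that this coordinate-only sketch ignores.

```python
def tiling_path(region_length, tile_length=5000, overlap=300):
    """Return (start, end) coordinates of overlapping tiles over a region."""
    if overlap >= tile_length:
        raise ValueError("overlap must be shorter than the tile length")
    step = tile_length - overlap
    tiles, start = [], 0
    while True:
        end = min(start + tile_length, region_length)
        tiles.append((start, end))
        if end == region_length:
            break
        start += step
    return tiles


def negative_runs(results):
    """Return (first, last) tile indices for each run of negative PCRs.

    Each run marks a candidate deletion, to be tested with primers from
    the positive tiles on either side of it.
    """
    runs, start = [], None
    for i, positive in enumerate(results):
        if not positive and start is None:
            start = i
        elif positive and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(results) - 1))
    return runs


# A 40-kb region tiled with 5-kb fragments overlapping by 300 bp.
tiles = tiling_path(40000)
# Hypothetical long-PCR results for six tiles (True = product obtained).
runs = negative_runs([True, True, False, False, True, False])
```

A run of negatives flanked by positives on both sides is the pattern that a deletion-scanning PCR can resolve; a run that reaches the end of the path cannot be bracketed in this way.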
FIG. 1. TP-PCR.
FIG. 2. ETT2 and eip structures. (a) Structure of the ETT2 pathogenicity island in a number of E. coli and Shigella strains and comparisons with regions of Spi-1 and Spi-3 from S. enterica serovar Typhimurium. Homologous genes are vertically aligned. Insertions relative to the complete ETT2 sequence (as seen in strains Sakai and EAEC 042) are indicated with dashed lines. Dotted lines indicate deletions. (b) Structure of the eip island in EAEC 042 and comparison with the backbone sequence seen in Sakai (and other sequenced E. coli strains). Numbered brackets in both sections indicate the positions of primer pairs used for long PCR (see the text for details).
TABLE 2. Primers used to detect ETT2 and eip gene clusters
Several genes that are homologous to components of TTSS gene clusters were found upstream of ECs3714 in the Sakai strain (Table 3): they are ECs3713 (a pseudogene, homologous to part of orgB from Spi-1 [19% identity at the protein level over a 101-amino-acid stretch]), ECs3712/ygeK (encodes a homologue of the Spi-2 regulator SsrB [32% identity at the protein level over a 209-amino-acid stretch]), ECs3711/B2854 (a homologue of iagB from Spi-1 [43% identity at the protein level over a 122-amino-acid stretch]), ECs3709/ygeH (encodes a tetratricopeptide repeat TTSS regulator, homologous to hilA in Spi-1 [29% identity at the protein level over a 403-amino-acid stretch] [46]), and ECs3708/ygeG (encodes a tetratricopeptide repeat TTSS chaperone, homologous to sicA in Spi-1 [37% identity at the protein level over a 140-amino-acid stretch] [46]). These findings provide strong evidence that the ETT2 cluster extends upstream of ECs3714 and negate the claim (35) that the ETT2 gene cluster lacks chaperones (Table 3). Furthermore, they imply that the essential difference between EHEC and K-12 at this locus is a deletion in K-12 rather than an insertion into the EHEC genome, as claimed by Makino et al. (35), and that the originally commensal and now laboratory-adapted strain K-12 retains a remnant of an apparently virulence-associated type III secretion cluster.
TABLE 3. Genes within the ETT2 gene cluster
Curiously, three genes from the leftmost extremity of the ETT2 pathogenicity island are homologues of uncharacterized genes from the Spi-3 pathogenicity island from S. enterica: yqeH (ECs3703) is homologous to rmbA (39% identity at the protein level over a 190-amino-acid stretch), yqeI (ECs3704) is homologous to marT (41% identity at the protein level over a 101-amino-acid stretch), and yqeJ (ECs3705) is homologous to fidL (43% identity at the protein level over a 139-amino-acid stretch). Furthermore, ygeK (ECs3712) encodes a homologue of the Spi-2 regulator SsrB (32% identity at the protein level over a 209-amino-acid stretch). Others have noticed the presence of Spi-1- and Spi-3-related genes in E. coli K-12 (2, 27), but their context as remnants of the ETT2 island only becomes clear through comparisons with E. coli O157 and other pathotypes (Fig. 2a).
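Identity figures of the form "39% identity over a 190-amino-acid stretch" are typically the number of identical aligned positions divided by the length of the aligned stretch. The sketch below shows that arithmetic on a toy pairwise alignment. The function name and the sequences are hypothetical, and counting gap columns in the denominator is an assumed convention; the values in the text came from BLAST searches, not from this code.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; columns with a
    gap in only one sequence count toward the stretch length but never
    as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y and x != gap:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: 4 identical positions over a 6-column stretch.
identity = percent_identity("MKT-LV", "MRTALV")
```

Low values such as 19 to 20% over short stretches are why the genomic context and domain searches described in the methods matter when homology is assigned.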
A comparison of the ETT2 island with Spi-1 revealed a generally similar gene complement and arrangement, with a few notable differences (Fig. 2a; Table 3). For example, there were some additional putative transcriptional regulator genes, such as ECs3720, a lack of gene products similar to the Spi-1 secreted proteins AvrA, SptP, and SipABCD (despite the presence of the sicA-like chaperone gene ygeG), and an absence in the O157 annotation of an invH homologue, although a potential invH homologue or pseudogene can be discerned in the region between ECs3720 and EprS and has been annotated as orf7 by Makino et al. (35) for strain B171-8. However, the most striking finding was that three TTSS structural genes, the homologues of spaR, prgH, and orgB, were disrupted by frameshift mutations in both EHEC O157:H7 genome sequences (Fig. 2a; Table 3). Since any one of these mutations would, by analogy with Spi-1, abolish type III secretion (8, 31), we conclude that ETT2 is incomplete and inactive as a TTSS in EHEC O157:H7, despite claims to the contrary (35).
The alignment of EivC from ETT2 with its homologues from other TTSSs allowed the construction of a phylogenetic tree (Fig. 3a) which indicated that ETT2 belongs to the Spi-1/Mxi-Spa group of TTSSs, as do the recently discovered TTSSs from Chromobacterium violaceum, two insect endosymbionts, and Yersinia enterocolitica (11, 12, 19, 59).
FIG. 3. Phylogenetic trees showing relationship of ETT2 to other TTSSs based on neighbor-joining analysis of EivC and its homologues (a) and of EicA and YgeG TPR chaperones to each other and to other chaperones (b). The numbers on the branches indicate the percentages of bootstrap support based on 1,000 replicates. Numbers in parentheses are GI numbers of published sequences.
In contrast to EAEC 042, all other genome-sequenced Escherichia or Shigella pathotypes possessed either an incomplete ETT2 gene cluster or, as noted for the EPEC1 and UPEC strains, none at all. We noted an important distinction between the EPEC1 genome-sequenced strain E2348/69, which lacks the ETT2 island entirely, and EPEC2 strain B171-8, which contains an ETT2 cluster with an 8.7-kb deletion (35) centered on a 7-bp repeat (CC/ATCATT) (Fig. 2a). This finding conflicts with the report by Makino et al. (35) that E2348/69 and B171-8 both contain similar sets of ETT2 genes, but it is concordant with the known highly divergent origins of the EPEC1 and EPEC2 clades (17). Three additional patterns of deletion were noted for the ETT2 gene cluster. An identical 14.6-kb deletion, which removed almost all of the secretion apparatus genes but left some TTSS regulators and chaperones, was found in the two K-12 laboratory strains (Fig. 2a). A similar but slightly larger 18.3-kb deletion was seen for S. sonnei, while a 26.5-kb deletion resulting in an almost total loss of the gene cluster was seen for the two S. flexneri sequences, in which all that remained were fragments of the two genes at the extremities of the island. We could not find the yqeG and glyU genes in the most recent release of the S. dysenteriae M131649 genome sequence, which suggests that this strain has undergone such extensive deletions in this region of the chromosome that it is impossible to determine whether its lineage ever possessed ETT2. Additional frameshift mutations were present in some pathotypes (Fig. 2a; Table 3).
The observed spectrum of ETT2 genotypes could be explained either by a single insertion with subsequent gene loss, or far less plausibly, by assembly of the largest ETT2 gene clusters from the smaller clusters by gene acquisition. Several lines of evidence lead us to strongly favor the first scenario. (i) The indels are centered on structural genes, which form a functionally coherent unit, showing a large degree of conservation in gene order with Spi-1 and other TTSSs. It is highly unlikely that the same gene order would arise independently by gene acquisition within E. coli. (ii) In the smaller clusters, the indel boundaries are often marked by truncated genes, as judged by homology with Spi-1 (e.g., b2859 in K-12 is a truncation of ECs3715), suggesting that deletion occurred rather than insertion. (iii) Insertion sequences of several classes (IS1, IS2, and IS3) are found at the sites of indels in ETT2 gene clusters, suggesting that homologous recombination between such elements may account for multigene deletions. Similarly, homologous recombination between the two copies of the 7-bp repeat might account for the 8.7-kb deletion in B171-8 (Fig. 2a).
The ETT2 locus is present in whole or in part in the A, B1, D, and E groups of the ECOR commensal strains but was acquired after the divergence of the B2 group. We wished to survey the phylogenetic distribution of the ETT2 pathogenicity island. However, unlike previous surveys, which sampled discontinuous fragments of the island from pathogenic E. coli strains allied to O157, we attempted (i) to sample the full range of phylogenetic diversity within the species E. coli, including commensal strains, and (ii) to determine the complete tiling path through this region of the chromosome for a large number of strains. We surveyed the well-characterized ECOR strain collection, which is richly diverse in terms of phylogeny and geographical, clinical, and zoological strain origins (Table 1) (44). We devised a method, TP-PCR (Fig. 1), that exploits short- and long-PCR protocols to construct a complete tiling path through the relevant chromosomal region. This method allowed us to construct a complete tiling path through the ETT2 gene cluster for 68 of 72 ECOR strains and for all of the representative pathotype strains (Table 1).
Using TP-PCR, we discovered that the ETT2 gene cluster was present in whole or in part in the majority of the E. coli strains sampled (50 of 72 [69% of the ECOR collection]), whether they were commensal or pathogenic (Table 1; Fig. 4). There was no evidence of large insertions or rearrangements within this cluster. Furthermore, the ETT2 gene cluster always occurred at the same chromosomal location, within the yqeG-glyU intergenic region, unlike the LEE, which can be inserted into selC, pheV, or pheU (17, 57), adding weight to our conclusion that the island entered an ancestral E. coli strain once and then was lost through mutational attrition in most strains.
FIG. 4. Illustrative PCR results for ETT2 gene cluster. Strains, from top to bottom: O157 Sakai strain (complete ETT2 gene set), ECOR1 (a B171-8-like strain with an 8.7-kb deletion), K-12 (14.6-kb deletion), and CFT073 (UPEC, with no ETT2). Lanes, from left to right: molecular weight markers, 5-kb amplicons obtained with ETT2 TP-PCR primer pairs 1 to 10 (see text for details), negative control (DNA, no primers), and molecular weight markers (HyperLadder I; Bioline).
FIG. 5. TP-PCR results superimposed on phylogenetic structure of E. coli. The tree was obtained by neighbor-joining analysis of the ECOR MLEE data (available at http://foodsafe.msu.edu/whittam/ecor) using the program Neighbor, part of the PHYLIP package (J. Felsenstein; available from http://evolution.genetics.washington.edu/phylip.html). Branches containing one of the three most common genotypes are indicated by bold, dashed, or gray lines. Filled circles indicate strains with eip clusters.
FIG. 6. Indel-specific short PCRs. The first two rows show the results for ECOR strains from 200-bp PCRs across the deletion seen in strain B171-8. The second two rows show the results for ECOR strains from 600-bp PCRs across the ETT2 insertion site (see Table 1 for details). Positive results are labeled with ECOR strain numbers or pathotypes.
Overall, the most common TP-PCR pattern was that corresponding to the 8.7-kb deletion seen in the already sequenced ETT2 cluster from EPEC2 strain B171-8 (35). This TP-PCR pattern predominated in the A and B1 ECOR groups. We confirmed that the deletion was identical (to within a few base pairs) to that in B171-8 in 17 of 25 ECOR group A strains, 12 of 16 ECOR group B1 strains, and the pathogenic strains H10407 (ETEC) and EAEC25 (EAEC) by performing a short 200-bp PCR across the deletion site (Table 1; Fig. 6).
To confirm that the high prevalence of ETT2 gene fragments among the strains of the ECOR collection was a general phenomenon of human E. coli isolates, we performed short PCRs with the ETT2 TP-PCR primers and collections of 43 freshly collected local blood culture isolates of E. coli and 36 freshly collected local urine contaminants from patients with no laboratory evidence of urinary tract infections (presumptive commensal strains of E. coli). ETT2 gene fragments were detected in 16 of 43 (37%) bloodstream isolates and 10 of 36 (28%) commensal isolates (data not shown), confirming the high prevalence of ETT2 gene fragments in E. coli (albeit a lower one than in the ECOR collection) and the lack of any obvious link to virulence.
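The comparison of ETT2 prevalence between bloodstream and commensal isolates can be made concrete with a two-sided Fisher's exact test on the counts given above. This is an illustrative sketch only: no formal statistical test is reported in the text, the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x positives in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


# Counts from the text: 16 of 43 bloodstream and 10 of 36 commensal
# isolates carried ETT2 gene fragments.
pct_blood = round(100 * 16 / 43)      # 37
pct_commensal = round(100 * 10 / 36)  # 28
p_value = fisher_exact_two_sided(16, 43 - 16, 10, 36 - 10)
```

With samples of this size, a difference of roughly nine percentage points does not reach conventional significance under this test, which is in line with the stated lack of any obvious link to virulence.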
For comparison, we surveyed the ECOR collection for fragments of the LEE by using short PCR. Only two strains were positive for LEE fragments, namely ECOR25, an ECOR group A strain from a healthy dog (four of five fragments were positive), and ECOR37, an ECOR group E strain from a healthy marmoset (all five fragments were positive) (data not shown).
A second type III secretion locus, the eip locus, encodes homologues of Spi-1 translocators and additional TTSS-related proteins in a minority of E. coli strains. Prompted by the lack of any sipABCD homologues in the ETT2 pathogenicity island, we searched the available Escherichia and Shigella genomes for similar genes. We discovered a novel 20.9-kb pathogenicity island, which we have termed the eip island, within the EAEC 042 genome. This island is inserted between the E. coli backbone genes yicM and nlpA and contains two homologues of sip genes that we have termed eipB (encodes a protein that shows 20% identity over a 527-amino-acid stretch with SipB) and eipD (encodes a product that shows 31% identity over a 266-amino-acid stretch with SipD) (Fig. 2b). Between these two eip genes lies a third gene (which we have termed eipX) that shows weak similarity to espD (the encoded product shows 20% identity over a 282-amino-acid stretch with EspD) and thus may encode an additional secreted translocator protein. In addition, the eip island contains genes for a novel SicA-like tetratricopeptide repeat chaperone (eicA), a novel HilA-like regulator (eilA) (46), and an invasin/intimin-like large outer membrane protein (eaeX) (Fig. 2b). Phylogenetic analysis of the tetratricopeptide repeat chaperones from 042 suggested that the chaperones from the eip and ETT2 gene clusters arose from a duplication after the ETT2 system diverged from Spi-1 (Fig. 3b).
A set of short PCRs, targeting fragments that were evenly spaced throughout the eip island, were applied to the ECOR collection and to our collections of clinical isolates (Fig. 2b). The distribution of the eip island mapped onto the phylogenetic structure of the ECOR collection and correlated well with the distribution of the most intact ETT2 clusters: the 13 eip-positive ECOR strains encompassed 11 of the 12 D strains and 2 of the 5 E strains (Fig. 5; Table 1). Significantly, all six ECOR strains that showed an intact Sakai-like ETT2 genotype possessed the eip cluster. This suggests that the Sakai and EDL933 strains are unusual in harboring an intact ETT2 cluster without an accompanying eip island. Among clinical isolates, the eip island was present in 7 of 15 ETT2-positive bloodstream isolates and 6 of 10 ETT2-positive commensal isolates but was never present in any isolates that lacked genes from the ETT2 cluster (data not shown). Furthermore, like the ETT2 gene cluster, it did not appear to be significantly more common in bloodstream isolates than in commensals (7 of 43 versus 6 of 36 isolates) (data not shown).
The second surprise was that in most strains, the ETT2 gene cluster has undergone mutational attrition, so that it can no longer encode a functioning TTSS. In most strains, genes have been deleted, with one particular deletion, that already described for EPEC2 strain B171-8 (35), being common to most strains in the ECOR A and B1 groups. However, even in the O157 genomes for which it was first described (24, 49), the ETT2 gene cluster contains numerous inactivating mutations which, by analogy with Spi-1, must abolish its functions. Thus, in contrast to a previous claim (35) that the ETT2 genes from the O157 Sakai strain encode a functioning TTSS that can mediate the secretion of EspB (now considered doubtful [T. Tobe, personal communication]), we conclude that ETT2 cannot function as a secretion system in either of the two genome-sequenced E. coli O157 strains. Instead, here, as in most strains, it represents a "rudimentary, atrophied or aborted organ" of the sort predicted to occur by Darwin as a consequence of the theory of evolution (13).
Several salient messages arise from the observation of ETT2 gene fragments in the laboratory model and nonpathogenic strain K-12. Firstly, despite the fact that K-12 has one of the smallest E. coli genomes (43), it is not safe to conclude that it represents the ancestral state for the species; in other words, apparent insertions relative to K-12 in other strains may in fact represent deletions in K-12 (47). Examples of this phenomenon other than ETT2 include O-island 82, of which K-12 has lost most of the gene for an iron-regulated outer membrane protein (1), and the mbhA and fhiA genes from K-12 which represent the residual boundaries of an ancient lateral flagellar gene cluster present in the ancestral E. coli (M. J. Pallen, unpublished data). Thus, as others have noted (47), the polarity of such intergenomic changes can only be inferred safely after comparisons with several diverse genomes.
Secondly, not every apparent coding sequence in K-12 should be expected to encode a functioning protein: as others have noted, many genes, particularly orphan open reading frames, may represent pseudogenes (26, 39), a fact that is likely to frustrate attempts for a global genome-wide functional characterization of all open reading frames of K-12 (41). Within the K-12 ETT2 cluster, we can confidently dismiss yqeJ, ygeF, ygeI, b2854, ygeK, b2858, b2859, b2863, and b2864 as pseudogenes. These observations on indel polarity and pseudogenes together imply that the accurate annotation of any E. coli genome, including that of K-12, depends on multiple strain comparisons. This is a compelling reason to support genome sequencing efforts with additional strains and is a point that is strongly made by the fact that only 1 of 12 completely or nearly completely sequenced Escherichia and Shigella genomes possesses an apparently intact ETT2. Similarly, this study emphasizes that it is necessary in laboratory-based strain comparisons to sample a wide range of phylogenetic diversity and to construct a complete tiling path through a region of interest in almost all strains before safe conclusions can be drawn about associations between the genotype of a gene cluster and virulence or other phenotypes.
The function of ETT2 remains a mystery, as do the niche and signals that might trigger its expression and the identity and nature of its effectors. Its presence in commensal strains suggests that it might now play or once played a role in symbiotic colonization rather than in the pathogenesis of human or mammalian disease, as has been suggested for some other TTSSs (11, 12, 18, 36), including Spi-1 (42). Alternatively, its target may not be the mammalian gut at all, but instead it may aid in survival in the struggle with microscopic eukaryotes in the external environment (9, 32). The discovery of an apparently intact ETT2 secretion system in EAEC 042 raises the hope that studies of this cryptic TTSS might mirror the successes of recent studies of a cryptic type II secretion system that was first discovered in the genome sequence of E. coli K-12 (19a).
We do not yet have any functional data to link the newly found eip cluster to the ETT2 cluster. However, the fact that this cluster is only ever found in ECOR strains that possess an apparently complete ETT2 gene complement, taken together with the homology to Spi-1 and the clear functional relationship between the type III secretion and translocation machinery in other TTSSs, makes this a compelling hypothesis. Furthermore, the configuration of the ETT2 and eip gene clusters and the identities of the genes carried within them shed light on the evolution of pathogenicity islands in general and TTSSs in particular and raise some interesting questions. The separation of translocon genes from genes encoding the needle complex is unusual and is shared only with chlamydial TTSSs (55). However, given the close similarities of ETT2 to the other TTSSs in the proteobacteria, it seems unlikely that this separation of genes represents the ancestral state for the Spi-1-like systems; instead, it appears to be a specific derived feature of this system. Intriguingly, the ETT2 gene cluster encodes a tetratricopeptide repeat chaperone, which would normally be expected to bind to a translocator (46); however, the ETT2 cluster lacks translocator genes, while the eip cluster encodes its own tetratricopeptide repeat chaperone, which presumably binds to the EipBXD proteins. Similarly, there are two hilA homologues: one in the ETT2 cluster and one in the eip cluster. This curious redundancy, together with phylogenetic data on the chaperones, suggests that the ETT2 and eip clusters were once part of a larger cluster that underwent fission, accompanied by the duplication of chaperones and regulators, after its divergence from its common ancestor with Spi-1.
The ETT2 cluster simultaneously provides a model of gene flux and mobility on the one hand and a model of genetic stasis and loss on the other. The shuffling of homologues of genes from three distinct Salmonella pathogenicity islands (Spi-1, Spi-2, and Spi-3) into one cluster in E. coli represents pathogenicity genes in motion. However, a single insertion followed by mutational attrition provides a model for how genes are lost from a genome once they no longer provide any selective advantage (33): frameshift mutations are followed by gene deletions and the arrival of insertion sequences that may then catalyze deletions through homologous recombination between nearby elements. With the ETT2 cluster, we see the whole spectrum of reductive evolution, from an apparently intact 27.5-kb cluster in 042 to just two residual gene fragments in S. flexneri (Fig. 2a). However, even though some TTSS genes within the cluster may be nonfunctional or missing in most strains, this does not mean that all genes within the cluster are without an effect. We have recently found evidence that some regulators encoded within the island exert a profound effect on other virulence-related loci in at least one strain (L. Zhang and M. J. Pallen, unpublished data). In this regard, we propose that a useful metaphor for the ETT2 cluster might be that of the grin of the Cheshire cat in Alice in Wonderland (5). We speculate that this metaphor (that powerful regulatory effects might outlive structural decay through mutational attrition) might also apply to other decaying prophages and pathogenicity islands.
We thank Gad Frankel, David O'Connor, and Mark Stevens for critical reading of the manuscript. We thank Arshad Khan for systems administration, and we are grateful to Michael Russell and Chengjie Liu for medium preparation. We thank Terry Alli for establishing PureGene chromosomal DNA preparations within our laboratory. We thank Debbie Mortiboy and other staff in the clinical microbiology laboratory of the Queen Elizabeth Hospital for help in obtaining local clinical isolates.