Journal of Bacteriology, December 2007, p. 9145-9149, Vol. 189, No. 24
0021-9193/07/$08.00+0 doi:10.1128/JB.00722-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Division of Infectious Diseases, Department of Pediatrics, University of Iowa Children's Hospital, Iowa City, Iowa,1 Department of Microbiology, University of Washington, Seattle, Washington,2 Division of Infectious Disease, Immunology, and Rheumatology, Department of Pediatrics, University of Washington, Seattle, Washington,3 Department of Immunobiology, University of Arizona, Tucson, Arizona,4 Department of Microbiology, University of Maryland, Baltimore, Maryland5
Received 7 May 2007/ Accepted 20 September 2007
Three plasmid libraries were constructed by partial restriction endonuclease digestion of the Longus-encoding virulence plasmid from ETEC strain E9034A. Plasmids were screened by PCR with lngA-derived primers and then other lng-specific primers (the plasmids used in this study are listed in Table 1). The Longus gene cluster obtained from ETEC strain E9034A is 14 kb in length and contains 16 open reading frames. Fourteen genes share considerable homology, as well as cluster topology, with CFA/III genes and are thus designated, in homology with the cof cluster, lngR, lngS, lngT, lngA, lngB, lngC, lngD, lngE, lngF, lngG, lngH, lngI, lngJ, and lngP (Fig. 1). Gene length, homology to known proteins, and putative functions are presented for each gene in Table 2. Two open reading frames, lngX1 and lngX2, were identified that have no homology to CFA/III pilin genes.
TABLE 1. Host strains and plasmids used or generated in this study
FIG. 1. Homology of Longus and CFA/III gene clusters. Arrows indicate the direction of transcription. Genes in black correspond to major structural subunit genes.
TABLE 2. Biosynthetic Longus genes, putative functions, gene homologues, and allelic variations
The lngA genes from the different ETEC strains were sequenced, multiple sequence alignments were performed with ClustalX 1.83 (30), and then PAUP* 4.0b (26) was used to construct maximum-likelihood-based phylogenetic trees. The lngA alleles segregated into three distinctive phylogenetic groups defined as groups 1, 2, and 3 (Fig. 2a). Primary (the most basal) nodes of groups 1 and 2 differed in 103 nucleotides (19%), leading to 22 amino acid changes (12%). Groups 1 and 2 differed from group 3 in 142 (26%) and 125 (23%) nucleotides, respectively, each leading to 34 (19%) amino acid changes. Alleles of the cofA gene in the two cofA-positive strains were also highly divergent from each other, with 137 nucleotide differences (24.8%) leading to 46 residue changes (25%). Although both allelic variants of cofA were more closely related to each other than to any of the lngA variants, the phylogenetic separation was not distinct (Fig. 2a). For example, the lngA group 1 and group 3 variants are phylogenetically more distant from each other than the lngA group 3 and cofA group 2 variants, obscuring evolutionary distinction of the Longus and CFA/III fimbrial structures.
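The nucleotide and amino acid difference counts quoted above can be illustrated with a minimal Python sketch. It assumes two sequences already aligned to the same length and skips gapped columns; how gaps were treated in the original analysis is not stated, and the function name and toy sequences are hypothetical.

```python
from typing import Tuple


def count_differences(seq_a: str, seq_b: str) -> Tuple[int, float]:
    """Return (differing sites, percent difference) for two aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs, 100.0 * diffs / compared


# Toy aligned fragments (hypothetical, not lngA sequence data).
n_diff, pct = count_differences("ATGGCTAAAT", "ATGACTAAGT")
print(n_diff, pct)
```

The same function works on aligned protein sequences, which is how a nucleotide count and the corresponding amino acid count can be reported side by side.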
FIG. 2. Maximum-likelihood phylograms of Longus sequences. (a) Unrooted phylogram of lngA sequences. lngA sequences from 21 ETEC strains along with cofA sequences from CFA/III ETEC strains were included. (b) Phylogram of concatenated nonstructural genes in the Longus (based on a subset of six strains) and CFA/III gene clusters. The intra-cof (260-1/E2528C/M403-C1) comparison of nonstructural genes is based on 4 out of 13 loci (cofC, -E, -H, and -P). (c) Phylogram of concatenated adk and fumC genes from 21 ETEC strains. The branch lengths refer to numbers of nucleotide differences. For panels a and b, two different length scales are used. The symbol # indicates strains in which the non-lngA genes were sequenced (see panel b). Homologous lngA allelic groups are denoted by underlined (group 1), bold (group 2), and plain (group 3) text. Strains from which sequences were derived are represented by three-letter designations. lngA group 1 strains: E9034A, L1A; 01117-5, L1B; B2C, L1C; M104, L1D; M145-C2, L1E; M424-C1, L1F; M526-C6B, L1G; P307, L1H; M408-C1, L1I. lngA group 2 strains: 10159-a, L2A; 11381a, L2B; 2108-2, L2C; B7A, L2D; G1026, L2E; M445-C1, L2F; M452-C1, L2G; M626-C, L2H; M633-C1, L2I; MP215-1, L2J; BR5, L2K. lngA group 3 strain: ECOR27, L3A. cofA group 1 strains: 2528C, C1A; 260-1, C1B. cofA group 2 strain: M403-C1, C2A.
Alignment of predicted LngA protein sequences (182 amino acids long) revealed that the between-group diversity of the protein is clustered in three regions, amino acids 56 to 77, 104 to 121, and 148 to 182 (see Fig. S1 in the supplemental material). Most of the within-group point replacements are located in these diversity regions as well. Unlike the amino acid diversity, the diversity of silent changes across lngA does not exhibit distinct clustering (see Fig. S1 in the supplemental material). The distribution of amino acid polymorphisms between the LngA and CofA protein sequences (as well as between the two CofA groups) shows less distinct clustering but reflects the general pattern of diversity seen among the LngA groups. Both LngA and CofA represent major surface subunits and thus antigens expected to be under strong selection for structural diversification. In particular, surface epitope regions would be prone to accumulate extensive structural changes while regions that are critical for proper tertiary structure or fimbrial morphology and/or function would remain conserved.
Unlike the structural subunit genes, nonstructural Longus genes are highly conserved (determined for the entire gene cluster in a subset of six strains), with more than 98% identity among homologs at the DNA (Fig. 2b) and protein (data not shown) levels. Homologs of nonstructural genes of the two CFA/III clusters (determined for cofC, -E, -H, and -P) were also conserved (99% identical to each other, on average), and after concatenation, they formed a distinct outgroup relative to the Longus genes (Fig. 2b), which is in contrast to the lngA/cofA tree (Fig. 2a). This indicates that while Longus and CFA/III pili are evolutionarily related, the cof and lng genes are paralogous in nature; i.e., they belong to evolutionarily distinct types of fimbriae.
To assess phylogenetic congruence among different lng loci, we performed an incongruence length difference test (8) with PAUP* 4.0b (26) that compared the sum of the branch lengths of a given pair of trees with the sums of the branch lengths obtained through 1,000 random partitionings of the original data sets. Individual phylograms for the nonstructural genes were congruent with each other, providing no evidence of recombinational shuffling of alleles among different clusters (not shown). In contrast, a gene tree of concatenated nonstructural genes (Fig. 2b) was not congruent with the tree of corresponding lngA alleles (P = 0.001), suggesting that the major structural subunit gene is moving horizontally among the fimbrial gene clusters.
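The resampling logic behind the incongruence length difference test can be sketched in Python as follows. This is only an illustration of the permutation step, not the PAUP* implementation: computing tree lengths requires a phylogenetic search that is left out here, and the function names and the counting convention (the observed partition counted as one replicate) are assumptions. Under incongruence, the original partitions tend to give a shorter summed tree length than random partitions of the same sizes, so the P value is the fraction of replicates whose summed length is no greater than the observed one.

```python
import random
from typing import List, Sequence, Tuple


def random_partition(n_sites: int, size_a: int,
                     rng: random.Random) -> Tuple[List[int], List[int]]:
    """Randomly split site indices 0..n_sites-1 into two sets,
    keeping the sizes of the original partitions (size_a, n_sites - size_a)."""
    sites = list(range(n_sites))
    rng.shuffle(sites)
    return sorted(sites[:size_a]), sorted(sites[size_a:])


def ild_p_value(observed_sum: float, random_sums: Sequence[float]) -> float:
    """P value from summed tree lengths of random partitions.

    Assumption: the observed partition is counted as one replicate,
    hence the +1 in numerator and denominator.
    """
    hits = sum(1 for s in random_sums if s <= observed_sum)
    return (hits + 1) / (len(random_sums) + 1)


# Hypothetical summed tree lengths; real values would come from a tree search
# on each pair of partitions (not shown).
rng = random.Random(0)
part_a, part_b = random_partition(n_sites=20, size_a=8, rng=rng)
print(len(part_a), len(part_b))
print(ild_p_value(100.0, [105.0] * 999))
```

With 999 random replicates that all exceed the observed sum, this convention gives the smallest attainable P value, 1/1000.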
To determine the population-level and intergenomic dynamics of the Longus gene clusters, we also sequenced two housekeeping loci, adk and fumC, in different ETEC strains. Housekeeping genes constitute the "backbone" of the bacterial chromosome and are subject to recombination only infrequently in E. coli, which provides the basis for phylogenetic and clonal grouping of E. coli strains (34). Both loci showed high levels of identity at the DNA level (98.1% for fumC and 99.4% for adk) (Fig. 2c), within the range reported for housekeeping genes of E. coli at the species level (23). No congruence was detected between the concatenated adk/fumC tree and the lngA groups (P = 0.001), indicating frequent horizontal transfer of the lngA genes among ETEC strains. The lngA alleles were incongruent even with the major phylogenetic clusters of E. coli (ECOR groups A and B1) to which the vast majority of the ETEC strains studied belong (determined by the Clermont method [7]; see Table S1 in the supplemental material), indicating gene movement among distant clones as well. At the same time, the trees of adk/fumC and nonstructural lng genes were congruent with each other (incongruence length difference test, P = 1) for the same sample set, indicating that the nonstructural genes are significantly less prone to transfer than lngA. Thus, it appears that the main mechanism for horizontal transfer of lngA among ETEC strains involves single-gene recombination rather than transfer of the entire Longus-carrying plasmid.
Sequence analysis of the fimbrial genes also provides insight into the evolutionary history of Longus. It appears that the highly conserved nonstructural lng genes evolved in a manner similar to that of housekeeping genes. It involved accumulation of point mutations (rather than recombination) and purifying selection against amino acid replacement changes, the latter based on the low ratio of nonsynonymous changes (Dn as determined according to reference 35 by DnaSP software) to synonymous, silent changes (Ds) seen in most of the genes (Table 2). Silent changes are considered to be functionally neutral and accumulate randomly over time, reflecting the molecular clock. We used two different molecular clock rates, 3E-8 changes/year (16) and 6E-9 changes/year (33). Thus, on the basis of nonstructural gene diversity, Longus and CFA/III fimbriae were acquired by E. coli around 0.73 to 3.64 and 0.97 to 4.83 million years ago, respectively (16, 33), i.e., probably at some time after E. coli speciation, which is estimated to have occurred around 5 to 6 million years ago (19).
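The molecular clock arithmetic can be sketched as below. The two rates are those quoted in the text; the synonymous divergence value is a hypothetical figure chosen only so that the faster rate reproduces the lower Longus bound, and the simple age = divergence / rate form is an assumption (some conventions divide pairwise divergence by 2, and the text does not state which was used).

```python
def clock_age_years(ds: float, rate: float) -> float:
    """Age estimate from synonymous divergence and a clock rate.

    Assumptions: ds is in substitutions per site, rate is in substitutions
    per site per year, and age = ds / rate.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return ds / rate


FAST_RATE = 3e-8   # changes/year, reference 16 in the text
SLOW_RATE = 6e-9   # changes/year, reference 33 in the text

# Hypothetical divergence, NOT a value reported in the article.
ILLUSTRATIVE_DS = 0.0219

young = clock_age_years(ILLUSTRATIVE_DS, FAST_RATE)  # lower age bound
old = clock_age_years(ILLUSTRATIVE_DS, SLOW_RATE)    # upper age bound
print(young / 1e6, old / 1e6)  # ages in millions of years
```

Because the two rates differ fivefold, the upper and lower age bounds for any one gene cluster also differ fivefold, which matches the reported ranges (0.73 to 3.64 and 0.97 to 4.83 million years) up to rounding.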
The synonymous and nonsynonymous diversity of lngA and cofA alleles is approximately 10-fold higher than that of corresponding nonstructural genes (Table 2). The high rates of both nonsynonymous and synonymous changes in the major subunit genes could be due to a high rate of either nonhomologous intragenic recombination (gene shuffling) or point mutations in the structural genes relative to the nonstructural ones. Alternatively, this could be explained by parallel evolution of the structural allelic variants for extended periods of evolutionary time, with the lngA (and cofA) gene groups starting to diverge at the approximate time of the divergence of the Longus and CFA/III gene clusters from each other 20 to 100 million years ago, i.e., well before E. coli speciation. We propose that lng and cof clusters were acquired by E. coli in separate unique events after the time of speciation, with this evolutionary bottleneck explaining both the distinct phylogeny and low diversity of the lng and cof nonstructural genes. Afterwards, there were continuous horizontal-transfer events acting only on the major subunit genes lngA and cofA, both between and within species, driven by strong pressure for antigenic diversity.
Despite the antigenic diversity of lngA and cofA, the present study indicates that there is significant structural conservation between the group variants, allowing for likely cross-reactivity among them and thus the development of a Longus-CFA/III immunoprotective multicomponent vaccine against ETEC diarrheal disease.
Nucleotide sequence accession number. Newly described DNA sequences reported in this study were deposited in the GenBank database. The entire Longus cluster DNA sequence was assigned accession number EF595770. The cofA and lngA gene DNA sequences were assigned accession numbers EU107087 to EU107107.
We are grateful to Steve L. Moseley for providing E. coli strains for this study. We thank Jing Bai for technical assistance with DNA experiments.
Published ahead of print on 19 October 2007.
Supplemental material for this article may be found at http://jb.asm.org/.