Journal of Bacteriology, March 2002, p. 1304-1313, Vol. 184, No. 5
0021-9193/02/$04.00+0 DOI: 10.1128/JB.184.5.1304-1313.2002
Copyright © 2002, American Society for Microbiology. All Rights Reserved.
Departament de Microbiologia i Parasitologia Sanitàries, Divisió de Ciències de la Salut, Universitat de Barcelona, E-08028 Barcelona, Spain
Received 12 July 2001/ Accepted 28 November 2001
In the human host the pathogenesis of V. cholerae involves the coordinated expression of a number of virulence factors, including cholera toxin (CT), which is directly responsible for the symptoms of the disease. The ctxAB operon, which encodes the A and B subunits of CT, is part of the genome of a lysogenic filamentous bacteriophage (CTXΦ). The receptor for CTXΦ, the toxin-coregulated pilus, is encoded by a larger genetic element, the toxin-coregulated pilus pathogenicity island, which is acquired by horizontal transfer (45).
Before 1992, only V. cholerae O1 was known to cause cholera epidemics. However, in September 1992 a severe cholera outbreak caused by a non-O1 strain, identified as serogroup O139, occurred in the Bay of Bengal. Several studies have shown that V. cholerae O139 is closely related to O1 ElTor, which is responsible for the seventh pandemic. According to several authors, O139 isolates have derived from a seventh-pandemic clone by horizontal gene transfer (5). This new serogroup rapidly spread through India and neighboring countries of Southeast Asia (1, 35). At first this new serogroup displaced the existing O1 strains in India and Bangladesh; however, a new clone of V. cholerae O1 biotype ElTor replaced the O139 vibrios during 1994 and 1995 (12). In 1996 a reemergence of V. cholerae O139 was reported in Calcutta, and this was the dominant serogroup until 1997. At present, this strain is still confined to Southeast Asia, and only imported cases have been detected in other countries across the globe. The seventh pandemic is still occurring throughout the world, and the number of countries affected continues to increase, especially in Africa (46). The transient disappearance and reemergence of the O139 vibrios have raised questions regarding the origin of the O139 strains and the clonal diversity among strains belonging to this serogroup (13).
Cholera outbreaks have recently been associated with climatic changes (29). The increase in cholera cases in recent years in Central and Southern Africa could be related to the El Niño phenomenon. Such climatic changes trigger a chain of events, including increases in water temperature, nutrient concentration, and plankton growth, that would multiply the number of cholera cases (24).
Population studies of V. cholerae based on multilocus enzyme electrophoresis (MLEE) considered the O139 strains isolated during the first period (1992 to 1993) to be a unique clone (4, 34). However, the application of molecular methods such as restriction fragment length polymorphism (RFLP), pulsed-field gel electrophoresis (PFGE), DNA sequencing, and amplified fragment length polymorphism (AFLP) (8, 13, 21, 23) has shown clonal diversity among O139 isolates and has revealed the existence of different ribotypes. More recently, comparative studies of O139 strains isolated in the two dominant time periods have shown that, although the reemerging strains of V. cholerae O139 (from 1996 to 1997) had biochemical traits identical to those of strains isolated during the first period, their molecular characterization differed in the organization and number of the CTX element and in ribotype (2, 13, 14, 30).
Farfán et al. (11) applied the MLEE technique to study a collection of V. cholerae isolates from several countries and sources to determine the genetic relationships between pathogenic clones (O1 and O139) and environmental isolates. Analysis of the electrophoretic mobility of 15 housekeeping enzyme loci showed considerable diversity within the O139 serogroup. To confirm the distinct clonal lineages among a set of O139 strains, we have developed a comparative nucleotide sequence analysis based on a scheme of the Multilocus Sequence Typing (MLST) method (25). This technique is an extension of MLEE in which the alleles at each housekeeping locus are assigned according to nucleotide changes detected by sequencing rather than the differences in the electrophoretic mobility of their gene products. We sequenced internal fragments of the DNA (ca. 480 bp) of six housekeeping genes for 29 V. cholerae O139 strains isolated during the first period. A toxigenic O1 ElTor strain of the seventh pandemic and an environmental non-O1/non-O139 strain were also included. Our results revealed the existence of four distinct sequence types (STs) within the O139 population studied, thereby showing the different origins of this serogroup of V. cholerae.
TABLE 1. Properties of V. cholerae isolates analyzed and their allele profiles at each locus
480-bp fragments of each gene on both strands. Gene fragments were amplified from chromosomal DNA of the 31 V. cholerae strains. PCR was performed in a 50-µl reaction mixture by using the following program: an initial denaturation step at 94°C for 5 min and then 35 cycles of denaturation (94°C for 45 s), annealing (48 to 55°C for 1 min), and extension (72°C for 1.30 min), followed by a final extension step at 72°C for 10 min. Amplification was carried out on a 2400 Gene Amp PCR Systems Thermal Cycler (Perkin-Elmer Corp., Norwalk, Conn.). PCR products were resolved by electrophoresis on 1% agarose gels made with 1x TBE (10x TBE consisting of 0.90 M Tris, 0.90 M boric acid, and 20 mM EDTA) and containing ethidium bromide. Amplification bands were visualized by UV transillumination. The PCR primers and annealing temperatures used in this study are listed in Table 2.
TABLE 2. Sequences of primers used in PCR
Data treatment and statistical analysis. For each locus, the sequences obtained for all isolates were compared, and the different sequences were assigned arbitrary allele numbers. For each isolate, the combination of alleles obtained at each locus (Table 1) defined its allelic profile. We refer to a unique combination of alleles as an ST.
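The allele and ST assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function names, the isolate labels, and the short toy sequences are all hypothetical, and allele numbers are simply assigned in order of first appearance.

```python
from typing import Dict, Tuple


def assign_alleles(sequences: Dict[str, str]) -> Dict[str, int]:
    """Give each distinct sequence at one locus an arbitrary allele number (1, 2, ...)."""
    allele_of_sequence: Dict[str, int] = {}
    alleles: Dict[str, int] = {}
    for isolate, sequence in sequences.items():
        if sequence not in allele_of_sequence:
            allele_of_sequence[sequence] = len(allele_of_sequence) + 1
        alleles[isolate] = allele_of_sequence[sequence]
    return alleles


def assign_sts(
    loci: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, Tuple[int, ...]], Dict[str, int]]:
    """Build the allelic profile of each isolate and number each unique profile as an ST."""
    locus_names = list(loci)
    alleles = {locus: assign_alleles(seqs) for locus, seqs in loci.items()}
    isolates = list(loci[locus_names[0]])
    profiles = {
        isolate: tuple(alleles[locus][isolate] for locus in locus_names)
        for isolate in isolates
    }
    st_of_profile: Dict[Tuple[int, ...], int] = {}
    sts: Dict[str, int] = {}
    for isolate, profile in profiles.items():
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        sts[isolate] = st_of_profile[profile]
    return profiles, sts


# Hypothetical toy data: two loci, three isolates, 4-bp "fragments".
toy_loci = {
    "locus1": {"A": "ACGT", "B": "ACGT", "C": "ACGA"},
    "locus2": {"A": "GGCC", "B": "GGCA", "C": "GGCC"},
}
profiles, sts = assign_sts(toy_loci)
```

In this toy case every isolate has a different allelic profile, so each receives its own ST even though any two of them share an allele at one locus.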
Allelic (haplotypic) diversity was calculated for each gene as follows:

$$h_j = \frac{n}{n-1}\left(1 - \sum_i x_i^2\right) \qquad (1)$$

where $x_i$ is the frequency of the ith allele at the jth locus and $n$ is the number of isolates. The mean genetic diversity ($H$) is the arithmetic mean of $h_j$ for the six genes analyzed. The nucleotide diversity ($\pi$) was calculated by using the DnaSP package, version 3.51 (Faculty of Biology, University of Barcelona [http://www.bio.ub.es/~julio/DnaSP.html]) (39), as follows:

$$\pi = \frac{n}{n-1}\sum_{i,j} x_i x_j \pi_{ij} \qquad (2)$$

where $n$, $x_i$ or $x_j$, and $\pi_{ij}$ are the number of alleles, the frequency of the ith or jth allele in the sample, and the proportion of distinct nucleotides between the ith and jth alleles, respectively. Sequence alignments and comparisons were done with the CLUSTAL W Multiple Sequence Alignment program, version 1.8 (EMBL European Bioinformatics Institute [http://www.ebi.ac.uk]) (19), and gene trees were constructed by using the Jukes-Cantor distances method with the Molecular Evolutionary Genetics Analysis (MEGA) suite of programs, version 2.0 (22). The reliability of the gene trees obtained (Fig. 3) was determined by bootstrapping after 1,000 replications.
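As a rough illustration of the two diversity measures, the sketch below computes an allelic diversity of the common unbiased form h = n/(n-1) · (1 - Σ x_i²), its mean over loci, and a nucleotide diversity taken as the mean proportion of differing sites over all pairs of sampled sequences. These forms are assumptions made for the example (the second is the usual sample-based equivalent of a frequency-weighted sum over allele pairs); the function names and toy inputs are hypothetical, and DnaSP itself is not used here.

```python
from collections import Counter
from itertools import combinations
from typing import Sequence


def allelic_diversity(alleles: Sequence[int]) -> float:
    """Unbiased allelic diversity at one locus from the allele number of each isolate."""
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are needed")
    sum_sq = sum((count / n) ** 2 for count in Counter(alleles).values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_diversity(per_locus_alleles: Sequence[Sequence[int]]) -> float:
    """Arithmetic mean of the per-locus allelic diversities."""
    values = [allelic_diversity(locus) for locus in per_locus_alleles]
    return sum(values) / len(values)


def nucleotide_diversity(sequences: Sequence[str]) -> float:
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are needed")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    length = len(sequences[0])
    pair_diffs = [
        sum(a != b for a, b in zip(s, t)) / length
        for s, t in combinations(sequences, 2)
    ]
    return sum(pair_diffs) / len(pair_diffs)


# Hypothetical toy inputs: four isolates at one locus, three short sequences.
h_example = allelic_diversity([1, 1, 1, 2])
pi_example = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

Passing one sequence per isolate (rather than one per distinct allele) makes the allele frequencies implicit in the pairwise average.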
FIG. 3. Gene trees constructed on the basis of Jukes-Cantor distances from sequences of each locus by using the UPGMA method. Construction and bootstrapping of the trees were carried out with the MEGA suite of programs. One thousand bootstrap replicates were performed for each analysis and bootstrap values are given at the branching nodes (representing the statistical reliability of nodes). Reference strains are shown at the end of each line, except for where we use the term "main group," which indicates the other strains of V. cholerae that do not appear in the tree. The scale bars indicate genetic distance and are presented below each tree.
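The Jukes-Cantor distance on which these gene trees are based can be sketched as follows. This is a minimal illustration of the standard correction d = -3/4 · ln(1 - 4p/3), where p is the observed proportion of differing sites; the function name and the example sequences are hypothetical, and the MEGA implementation may treat gaps and ambiguous bases differently.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    # p: observed proportion of sites at which the two sequences differ
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("distance is undefined when p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical 10-bp fragments differing at one site (p = 0.1).
d_example = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
```

For closely related sequences such as those compared here, the corrected distance is only slightly larger than the raw proportion of differences.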
FIG. 4. Dendrogram constructed by the UPGMA method showing the genetic relationships among the five STs of V. cholerae isolates studied. All strains belong to the O139 serogroup, except for the CO487 and 25872 isolates, which belong to the O1 ElTor and non-O1/non-O139 serogroups, respectively. The scale indicates the linkage distance.
Two types of statistical analysis were applied to our data: the index of association (IA) and Sawyer's run test. Multilocus linkage disequilibrium was estimated by measuring the IA as described previously (11, 16). A Monte Carlo simulation was generated by randomly sampling alleles, without replacement, according to their respective frequencies at each locus. The standardized IA was calculated with the LIAN program, version 3.1 (18). Sawyer's run test was performed according to a method described previously (41) with the START program. Using the START program, we also tested for selection in our population by calculating the dN/dS ratio as described by Nei and Gojobori (31).
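The index of association and its Monte Carlo test can be sketched as below. The example assumes the commonly used definition IA = V_O/V_E - 1, where V_O is the observed variance of the number of loci at which two isolates differ and V_E = Σ h_j(1 - h_j) is the variance expected under linkage equilibrium, with h_j = 1 - Σ x_i² at locus j. Resampling is done by shuffling alleles among isolates within each locus, which samples without replacement and preserves allele frequencies. The function names and the toy profiles are hypothetical, and the LIAN program may differ in detail.

```python
import random
from collections import Counter
from itertools import combinations
from typing import List, Sequence

Profile = Sequence[int]


def mismatch_variance(profiles: Sequence[Profile]) -> float:
    """Variance of the number of mismatching loci over all pairs of isolates (V_O)."""
    ks = [sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)]
    mean = sum(ks) / len(ks)
    return sum((k - mean) ** 2 for k in ks) / len(ks)


def expected_variance(profiles: Sequence[Profile]) -> float:
    """Variance expected if alleles at different loci were independent (V_E)."""
    n = len(profiles)
    total = 0.0
    for column in zip(*profiles):  # one column of allele numbers per locus
        h = 1.0 - sum((count / n) ** 2 for count in Counter(column).values())
        total += h * (1.0 - h)
    return total


def index_of_association(profiles: Sequence[Profile]) -> float:
    """IA = V_O / V_E - 1; requires at least one polymorphic locus."""
    return mismatch_variance(profiles) / expected_variance(profiles) - 1.0


def monte_carlo_p_value(
    profiles: Sequence[Profile], n_resamplings: int = 1000, seed: int = 0
) -> float:
    """Fraction of shuffled data sets whose V_O is at least the observed V_O."""
    rng = random.Random(seed)
    observed = mismatch_variance(profiles)
    columns: List[List[int]] = [list(column) for column in zip(*profiles)]
    hits = 0
    for _ in range(n_resamplings):
        for column in columns:
            rng.shuffle(column)  # permute alleles among isolates within a locus
        if mismatch_variance(list(zip(*columns))) >= observed:
            hits += 1
    return hits / n_resamplings


# Hypothetical, fully linked toy sample: two profiles, five isolates each.
linked = [(1, 1, 1)] * 5 + [(2, 2, 2)] * 5
ia_linked = index_of_association(linked)
```

Because shuffling leaves V_E unchanged, comparing V_O values is equivalent to comparing IA values between the observed and resampled data sets.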
Nucleotide sequence accession numbers. The GenBank accession numbers for the nucleotide sequences determined in this study are from AF343125 to AF343310.
FIG. 1. Circular representation of the two chromosomes, I (large) and II (small), of V. cholerae. Genomic locations of the six housekeeping genes analyzed are based on the complete genomic sequence of the V. cholerae O1 ElTor N16961 strain available in the databases (GenBank accession numbers AE003852 [chromosome I] and AE003853 [chromosome II]).
TABLE 3. Sequence variation at six loci
FIG. 2. Polymorphic sites in each of the six gene fragments studied. The nucleotides present at each variable site among the 31 V. cholerae isolates are shown for allele 1. For the other alleles, only sites that differ are shown; sites that are the same as those in allele 1 are indicated by periods. Polymorphisms that are synonymous (S) and nonsynonymous (N) are indicated below the sequences. Sites are numbered above in vertical format.
The dN/dS ratio, where dN indicates the number of nonsynonymous substitutions per nonsynonymous site and dS indicates the number of synonymous substitutions per synonymous site, was calculated as a measure of the degree of selection in our population. This ratio was calculated for all six loci and in all cases, except for the lap locus, it was <10%. The dN/dS ratio of the lap locus (12.4%) could indicate that it is not under selection that is as strong as in the case of a typical housekeeping gene. On the other hand, the two loci which presented high values of nucleotide diversity and number of substitutions per nucleotide site (Table 3) were cadA and epd. The epd locus was a special case because all nucleotide substitutions were synonymous and did not affect the amino acid composition.
Comparative results of nucleotide sequence analysis versus MLEE. In a previous study using MLEE (11) with the same strains, four of the six loci studied (asd, idh-II, lap, and mdh) showed a number of alleles per locus similar to that determined by nucleotide sequence analysis (Table 6), except for the lap locus, which exhibited a large number of alleles by MLEE. Table 6 also shows the values of allelic diversity (h) for all loci studied.
TABLE 6. Comparative data of nucleotide sequence analysis (NSA) versus MLEEa
Genetic relationships between STs. Table 1 shows the five allelic profiles or STs identified among the 31 V. cholerae isolates. A dendrogram constructed by using the UPGMA method from the matrix of pairwise differences between the allelic profiles of the V. cholerae population is shown in Fig. 4. This dendrogram shows correspondence between STs and clusters. The cophenetic correlation coefficient of the total sample was R = 0.83. The O139 isolates were distributed in several clusters (I, III, and IV). Cluster I contained two groups: one that included 24 of the 29 O139 strains and the O1 ElTor isolate (CO487) and a second group with an O139 isolate (CO404) that differed from the first group in only two loci (asd and lap). The remaining O139 strains grouped in two additional clusters, III and IV, which corresponded to ST2 (SO19) and ST4 (CO391, CO407, and 653/36). The non-O1/non-O139 isolate included in this study had an allelic profile different from those of all other isolates (ST5, cluster II). All STs differed from one another at several loci; ST3 and ST4 differed from ST1 at two and six loci, respectively.
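The construction of such a dendrogram from allelic profiles can be sketched as follows. The distance between two profiles is taken here as the proportion of loci at which they differ, and UPGMA is implemented directly as average linkage over all between-cluster pairs. The allele numbers are illustrative only, chosen so that ST3 and ST4 differ from ST1 at two and six loci; they are not the values of Table 1, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def profile_distance(p: Sequence[int], q: Sequence[int]) -> float:
    """Proportion of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q)) / len(p)


def upgma(profiles: Dict[str, Sequence[int]]) -> List[Tuple[Cluster, Cluster, float]]:
    """Return the successive merges as (cluster A, cluster B, linkage distance)."""
    clusters: List[Cluster] = [(name,) for name in profiles]

    def cluster_distance(a: Cluster, b: Cluster) -> float:
        # UPGMA: unweighted mean of all distances between members of a and of b
        total = sum(profile_distance(profiles[x], profiles[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    merges: List[Tuple[Cluster, Cluster, float]] = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Illustrative profiles over six loci (allele numbers are made up).
example_profiles = {
    "ST1": (1, 1, 1, 1, 1, 1),
    "ST3": (2, 1, 1, 1, 2, 1),  # differs from ST1 at two loci
    "ST4": (3, 2, 2, 2, 3, 2),  # differs from ST1 at all six loci
}
merges = upgma(example_profiles)
```

In this toy case ST1 and ST3 join first at a linkage distance of 2/6, and ST4 joins last at a distance of 1, mirroring the separation of ST4 from the main group.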
Evidence for recombination. Linkage disequilibrium between alleles was estimated with the IA statistic (7, 27) (Table 4). The IA value was 3.81 (P < 0.0001) for all isolates and 0.97 (P = 0.065) when only STs were considered. These results indicated a clonal structure of the sample studied. To remove the dependence on the number of loci, Hudson described another statistic, the standardized IA (IAS), defined as follows:

$$I_A^S = \frac{I_A}{l-1} \qquad (3)$$

where $l$ is the number of loci analyzed.
TABLE 4. Multilocus linkage disequilibrium analysis (IA) of the 31 V. cholerae strains studied
TABLE 5. Sawyer's test analysis for evidence of intragenic recombinationa
The factors that determine the emergence, disappearance, or continued presence of particular clones of V. cholerae are unclear. Some authors consider that O139 isolates pertaining to the first epidemic period (from 1992 to 1994) constitute a unique clone derived from an O1 ElTor strain (4, 21, 34). However, the application of molecular techniques has suggested the existence of greater genetic variability within the O139 population. These molecular analyses have identified two different ribotypes (14, 34) and four PFGE patterns (34) in O139 isolates of the first period. These results are in agreement with the diversity, determined by MLEE, that was observed in our O139 population (11).
The MLEE results allowed us to identify 26 electrophoretic types (ETs) from 29 V. cholerae O139 isolates belonging to the first period, which differed on average at six enzyme loci. Moreover, the mean genetic diversity (H) obtained by MLEE for this serogroup of V. cholerae was H = 0.40, which is substantially higher than that reported for some pathogenic bacterial species as a whole, e.g., Staphylococcus aureus (H = 0.289) and Legionella pneumophila (H = 0.312), and similar to the values reported for Neisseria gonorrhoeae (H = 0.410) and Escherichia coli (H = 0.433) (17, 33). The genetic diversity value for V. cholerae was higher than those reported in previous MLEE studies (4, 34), which described a more limited diversity among O139 isolates and considered this population a unique clone because the isolates clustered in a single ET.
In the present study, we carried out a comparative nucleotide sequence analysis of six housekeeping genes to verify the results previously obtained by MLEE. Sequence data analysis corroborated the diversity and the phylogenies inferred from the MLEE study. The differences found in the internal fragments of the six loci sequenced showed four distinct allelic profiles (ST1, ST2, ST3, and ST4) for V. cholerae O139, each with a very different pattern (Table 1). Two of these profiles (ST1 and ST4) were exhibited by 28 of the sample isolates (90.3%). Each of the remaining STs was represented by a single isolate (Fig. 4). Of the O139 isolates that clearly differ from the O1 and most O139 clinical isolates, three strains (CO391, CO407, and 653/36; ST4) are identical in the six sequences. They differ from ST1 in each gene and appear to be unrelated to the main set of 24 strains. The nucleotide differences in ST4 range from 1 to 18 bases per gene, for a total of 47 changes over 2,932 bp, or 1.6%. This value is within the limits previously determined by Byun et al. (8) in mdh (1.52%) and hlyA (3.25%) from a wide range of unrelated V. cholerae strains. This implies a high divergence of the strains constituting ST4 that is consistent with an independent origin.
The number of strains showing a given allelic profile was higher than expected under the assumption that the loci are independent, in which case the expected frequency of a profile is the product of the frequencies of the alleles of which it is composed (P < 0.0001 based on a Monte Carlo resampling). The number of alleles per locus for the six loci studied ranged from three to four. This implies that the number of distinct allelic profiles that this scheme can resolve (between 3^6 and 4^6, i.e., between 729 and 4,096) is not high. However, it is unlikely that unrelated O139 isolates exhibit the same allelic profile by chance alone. Nevertheless, from our data it is difficult to conclude whether the frequency of the different genotypes reflects the population structure of V. cholerae O139 or whether it results from the epidemic expansion of a particular allelic profile during the cholera outbreak, as a consequence of variable pathogenicity among the distinct genotypes of the serogroup.
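The two quantities used in this argument, the number of profiles a scheme can resolve and the frequency expected for a profile if loci were independent, are both simple products, as the sketch below shows. The function names are hypothetical, and the allele frequency of 0.8 used in the example is an arbitrary illustrative value, not a figure from this study.

```python
from math import prod
from typing import Sequence


def resolvable_profiles(alleles_per_locus: Sequence[int]) -> int:
    """Number of distinct allelic profiles: product of the allele counts per locus."""
    return prod(alleles_per_locus)


def expected_profile_frequency(allele_frequencies: Sequence[float]) -> float:
    """Expected frequency of a profile if alleles at different loci are independent."""
    return prod(allele_frequencies)


# Three or four alleles at each of six loci.
lower_bound = resolvable_profiles([3] * 6)   # 3^6
upper_bound = resolvable_profiles([4] * 6)   # 4^6

# Hypothetical profile built from alleles that each have frequency 0.8.
expected_freq = expected_profile_frequency([0.8] * 6)
expected_count = expected_freq * 31  # expected isolates with this profile out of 31
```

Under these illustrative frequencies, independence would predict only about 8 of 31 isolates sharing the profile, which shows how a dominant profile carried by most isolates can exceed the expectation under independence.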
The dendrogram obtained from distances between allelic profiles shows high genetic diversity in the O139 serogroup, as evidenced by the fact that ST1 and ST3 (cluster I) were more closely related to the non-O1/non-O139 isolate (cluster II) than to other O139 isolates (clusters III and IV). Interestingly, the O1 ElTor isolate (CO487) grouped in cluster I, which included 25 of the 29 O139 isolates, suggesting a closer genetic relationship with this cluster, in agreement with previous studies about the possible origin of the O139 serogroup (5, 42).
The comparison of results obtained by MLEE and sequence analysis is not straightforward because these methodologies examine the genetic characteristics of bacterial populations at different levels: enzymatic activity (MLEE) and nucleotide polymorphism (sequence analysis). Comparing the number of alleles obtained by MLEE and by sequence analysis for the four loci common to both studies, we found a good correlation, except for the lap locus, which presented a higher allelic diversity in the MLEE study (Table 6). These differences could be explained by the fact that MLEE reflects variation across the entire gene product, whereas in the comparative sequence analysis we sequenced only a partial region of each gene (ca. 480 bp) and therefore did not detect changes in the unsequenced part of the gene. A high degree of diversity in MLEE data compared with that obtained from DNA sequences has previously been reported for Salmonella group I strains (32) and Neisseria meningitidis serogroup A (15).
Previous studies in V. cholerae have analyzed the sequences of two of the genes we studied here: asd (21) and mdh (8). Karaolis et al. sequenced a 931-bp fragment of the asd gene for 45 V. cholerae isolates (O1, O139, and non-O1/non-O139 strains from clinical and environmental sources) and found that all but one of the clinical isolates shared the same asd sequence. When we sequenced a 495-bp fragment of the same gene we found two distinct sequences (allele 1 and allele 2) among the 29 O139 isolates. Most of the strains (25) exhibited an identical sequence (allele 1), and the remaining O139 isolates (allele 2) differed from allele 1 by a single nucleotide change. It is particularly noteworthy that in the gene fragment sequenced there is a region with seven nucleotide differences (positions 160 to 167) that result in two amino acid changes (Asn instead of Gln and Thr instead of Ala). These changes were also found by Karaolis et al. (positions 561 to 567 of their sequences) among the cluster corresponding to the V. cholerae sixth-pandemic isolates, and these authors considered this region to be a typical trait of these isolates. We identified this fragment in the non-O1/non-O139 isolate (i.e., isolate 25872). Meanwhile, the remaining strains (O139 and O1 ElTor) showed the same sequence as the pathogenic strains of Karaolis et al.
A previous study (8) determined the nucleotide sequence of the 936-bp coding region of mdh for 33 isolates of different serogroups of V. cholerae. Although this earlier study included only two O139 strains, it determined that the mdh sequences were identical for the sixth and seventh pandemics and for O139 isolates. These pathogenic strains were represented by the M793 sequence (Fig. 5). In the present study we sequenced a 495-bp fragment of the mdh gene and detected a higher diversity in this locus. Alignment of sequenced fragments showed three distinct groups of sequences (mdh1, mdh2, and mdh3) within our O139 isolates. The mdh1 sequence, corresponding to 25 of the O139 strains, the O1 ElTor isolate, and the non-O1/non-O139 isolate, was identical to that of Byun's clinical isolates (M793) (Fig. 5). In contrast, the mdh2 (three O139 strains) and mdh3 (one O139 strain) sequences differed, respectively, in six and seven nucleotide positions from the mdh1-M793 sequences. A possible explanation for the apparent incongruence between our results and those obtained by Byun et al. (8) is that we analyzed a larger number of O139 isolates (29), and this may have allowed us to find greater diversity in the gene fragment sequences. The different geographical origins of the strains used in the two studies could also influence the results.
FIG. 5. Dendrogram generated by the UPGMA method for the V. cholerae mdh locus from the different sequences obtained for O139 V. cholerae strains in this study (alleles 1, 2, and 3). Two sequences previously described by Byun et al. (8), which correspond to all pathogenic isolates analyzed in their study (O1 classical, O1 ElTor, and O139) and are represented by M793 and M645, are also included.
The genealogies observed in the gene trees and the high values obtained for the statistical stability of nodes by bootstrapping suggest that the two main clusters of isolates constitute two divergent clones of V. cholerae O139 that have evolved independently. Only three strains, two O139 strains (CO404 and SO19) and the non-O1/non-O139 strain (isolate 25872), show genealogies inconsistent with the clustering of our O139 population, and therefore we cannot exclude the possibility of recombinational events in these three strains.
The homoplasy test (28) measures the importance of recombination between members of a population. It is only valid when sequences differ by ≤5% of the nucleotides and requires a sufficient number of alleles and informative sites to yield interpretable results. In our case, all loci analyzed had an insufficient number of alleles and informative sites to perform this analysis.
Sawyer's test did not detect clear evidence of recombination in any of the loci analyzed. This includes the asd locus, in which we identified, in the non-O1/non-O139 isolate, a small region of 7 bp previously reported by Karaolis et al. (21) as the result of a recombinational event. The reason for this disagreement may lie in the definition of condensed fragments in Sawyer's test, which is designed to detect gene conversion in which both source and target sequences are in the sample. The only possible exception was the cadA locus, in which the results obtained by Sawyer's test are in agreement with split decomposition analysis. From our data only this locus showed a net structure in the split graph, suggesting the existence of recombination. Nevertheless, the split graphs generated for the other loci should be interpreted with care because of the low number of alleles obtained.
To detect the existence of some degree of association between the alleles determined by the sequencing of the six housekeeping genes, we calculated the IA (7, 27). The IA was significantly different from zero when we considered all isolates (IA = 3.81 ± 0.24, P < 0.0001). This result is consistent with the existence of strong linkage disequilibrium between the alleles and suggests a clonal population structure with the presence, if at all, of a low degree of recombination. However, when we considered only the STs defined in our sample (IA = 0.97 ± 0.62, P = 0.065), only slight evidence was obtained against the null hypothesis of random distribution of alleles. These values should be taken cautiously because of the low number of STs (five) obtained. To avoid the influence of the number of loci on the population analysis, we also calculated the standardized IA (IAS), a parameter that could be more valuable because it adjusts the IA statistic according to the number of loci analyzed. As expected, IAS was lower in both cases: for all isolates (IAS = 0.762) and for STs (IAS = 0.194).
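As a consistency check, the standardized values quoted here follow from dividing each IA by the number of loci minus one. The relation below assumes the usual standardization by $l - 1$, with $l = 6$ loci in this study:

$$I_A^S = \frac{I_A}{l-1}, \qquad \frac{3.81}{6-1} = 0.762, \qquad \frac{0.97}{6-1} = 0.194$$

Both results match the IAS values reported for all isolates and for STs, respectively.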
In conclusion, our results demonstrate the existence of at least three distinct clones among the V. cholerae O139 isolates studied. It is interesting that we have determined a minimum of three different origins among the O139 strains belonging to the period from 1992 to 1993, whereas in the more extensive studies of O1 V. cholerae isolates carried out over many years, only the related ElTor (seventh-pandemic) and classical (sixth-pandemic) forms have been seen, with no indication of other unrelated forms. Our results are in agreement with those of a previous MLEE study (11) and support the polyclonal hypothesis of the O139 serogroup (4, 21, 34). The majority of O139 strains analyzed (24 isolates) showed a close relationship with the O1 ElTor isolate included in this study, whereas others were genetically divergent. The observation that the two main groups of strains (ST1 and ST4) exhibited differences in the allelic profile of all six genes spaced around both chromosomes of V. cholerae suggests that V. cholerae O139 emerged independently from different progenitor strains that have acquired the ability to cause infection, which is in agreement with the origin of pathogenic strains in other bacterial species such as E. coli (37). The variation within the O139 strains does raise the possibility that O139 human pathogenic strains existed for some time before the Bengal outbreak occurred, since minor forms could have gone undetected until one form came to dominate for a period. However, it is also possible that V. cholerae strains from a different origin gained the O139 O antigen as part of their adaptation to the pathogenic mode.
Although the presence of a certain degree of recombination cannot be excluded, our data are consistent with a clonal population structure of this group of bacteria. However, strong linkage disequilibrium can also be explained by considering that, like many bacteria, V. cholerae forms a metapopulation made up of multiple ecological populations (38) which occupy different ecological niches, and recombination, although possible within populations, is rare or absent between distinct populations (26, 36, 47). The presence of two main genetic backgrounds in the sample studied, ST1 and ST4, could help to reveal recombination events in the population.
The six loci chosen in this study constitute a suitable basis for an MLST typing scheme, extending the old sequence data to perform a subsequent population genetics analysis and to determine the origin of pathogenic strains of V. cholerae.
Maribel Farfán is the recipient of a grant "Formació en la Recerca i Docència per a alumnes de tercer cicle" from the University of Barcelona. This work was supported by a grant from Vicerectorat de Recerca of the University of Barcelona.
precursor and evidence for independent acquisition of distinct CTXΦs by toxigenic Vibrio cholerae. J. Bacteriol. 182:5530-5538.