Journal of Bacteriology, April 2008, p. 2831-2840, Vol. 190, No. 8
0021-9193/08/$08.00+0 doi:10.1128/JB.01808-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Department of Food Science, North Carolina State University, Raleigh, North Carolina,1 Instituto de Acuicultura, Universidad de Santiago de Compostela, Campus Universitario Sur, 15782 Santiago de Compostela, Spain,2 Laboratorio de Biotecnología, Instituto de Nutrición y Tecnología de los Alimentos, Universidad de Chile, Santiago, Chile,3 Gulf Coast Seafood Laboratory, Food and Drug Administration, Dauphin Island, Alabama,4 Center for Food and Applied Nutrition, Food and Drug Administration, College Park, Maryland5
Received 15 November 2007/ Accepted 5 February 2008
Multilocus sequence typing (MLST) is based on sequence analysis of selected housekeeping (HK) genes and is becoming the method of choice for determining the global epidemiology of bacterial pathogens (e.g., Neisseria meningitidis and Staphylococcus aureus) (29, 30). Being sequence-based, MLST provides a definitive characterization of bacterial isolates that is consistent from one laboratory to the next. The nucleic acid sequences are typically stored in a public database that can be readily accessed via the Internet (http://www.mlst.net or http://pubmlst.org/). Previous MLST studies have led to a better understanding of the genetic relatedness of strains within a species and have identified the relative evolutionary importance of mutations and lateral transfer events (13, 16, 35).
A previous MLST study was conducted to investigate the evolution of the new pandemic strains of V. parahaemolyticus, and the authors hypothesized their evolution from a common O3:K6 ancestor (9). That study was limited primarily to pandemic strains and examined only four genes, all located in chromosome I. Usually six to eight genes are examined to provide a more comprehensive picture of the genetic characteristics of the organism analyzed (29). Additionally, Vibrio spp. (including V. parahaemolyticus) possess two chromosomes, and it would be useful to determine whether both of these are subjected to similar evolutionary pressures. Thus, the development of a more comprehensive MLST scheme for V. parahaemolyticus and one which is comparable to that used for other bacteria is warranted.
We report on the first MLST scheme for V. parahaemolyticus using sequences of internal fragments of seven HK genes. In order to better represent the population structure of the species, three genes from chromosome I and four from chromosome II were chosen. A well-characterized and geographically diverse set of V. parahaemolyticus isolates was examined. Isolates belonging to the V. parahaemolyticus pandemic clonal complex isolated from eight countries on four continents were also analyzed.
DNA extraction and PCR amplification. Bacterial DNA was extracted using the DNeasy kit in accordance with the manufacturer's instructions (Qiagen, Valencia, CA). PCR amplification was carried out using primers (IDT, Coralville, IA) detailed on the V. parahaemolyticus MLST website (http://pubmlst.org/vparahaemolyticus). The seven loci analyzed by MLST were dispersed on both chromosomes. These genes were chosen based on two previously published reports of MLST data for V. parahaemolyticus and for V. vulnificus (3, 9). For chromosome I, the HK genes chosen were recA (RecA protein), dnaE (DNA polymerase III, alpha subunit), and gyrB (DNA gyrase, subunit B). For chromosome II, the HK genes were dtdS (threonine 3-dehydrogenase), pntA (transhydrogenase alpha subunit), pyrC (dihydro-orotase), and tnaA (tryptophanase). PCR conditions were as follows: denaturation at 96°C for 1 min, primer annealing at 58°C for 1 min, and extension at 72°C for 1 min for 30 cycles, with a final extension step at 72°C for 10 min. In a few cases the annealing temperature was lowered to obtain a PCR product. Reagent concentrations in 100 µl of PCR mixture were 1.5 mM MgCl2, 0.125 mM deoxynucleoside triphosphates, 0.5 µM each primer (forward and reverse for each locus), and 1 U of Platinum Taq DNA high-fidelity polymerase (Invitrogen, Carlsbad, CA). One nanogram of DNA was used as template per PCR, and PCR products (10 µl each) were analyzed on 2% agarose gels run at 75 V for 1.5 h in 1× Tris-acetate-EDTA with a molecular weight standard (Lambda DNA/HindIII; Promega, Madison, WI) for determining amplicon concentrations. Amplification products were visualized by ethidium bromide staining. All PCR products consisted of a single band, and these were cleaned using the QIAquick PCR purification kit (Qiagen). PCR products were sequenced in both directions by Mclab (South San Francisco, CA) with primers M13F and M13R. DNA sequences were individually inspected and manually assembled.
The alignments of these sequences were determined using BioEdit (21). Numbers for alleles and sequence types (STs) were assigned according to the database created for V. parahaemolyticus (http://pubmlst.org/vparahaemolyticus) (25).
Assignment to clonal complexes. The program eBURST v 3.0 was used to identify the different clonal complexes (http://eburst.mlst.net). The most restrictive group definition was used to define the clonal complexes, i.e., at least six of the seven alleles had to be identical for STs to be included in the same group or clonal complex (15). The statistical confidences for the ancestral types were assessed using 1,000 bootstrap resamplings. Two different STs are considered single-locus variants (SLVs) when they differ from each other at a single locus. Double-locus variants (DLVs) are any two different STs differing at two loci.
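The SLV/DLV definitions and the six-of-seven grouping rule can be illustrated with a short Python sketch. This is not eBURST itself: the function names are hypothetical, the allele numbers are made up for illustration, and the grouping assumes that each ST in a group must share the minimum number of alleles with at least one other member of that group.

```python
from itertools import combinations


def loci_differing(profile_a, profile_b):
    """Count the loci at which two allelic profiles carry different alleles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def classify_pair(profile_a, profile_b):
    """Label a pair of STs as 'SLV', 'DLV', or 'other'."""
    n = loci_differing(profile_a, profile_b)
    if n == 1:
        return "SLV"
    if n == 2:
        return "DLV"
    return "other"


def group_by_shared_alleles(profiles, min_shared=6):
    """Link STs sharing at least min_shared alleles and return the linked groups."""
    parent = {st: st for st in profiles}

    def find(st):
        # Follow parent links to the group representative (with path halving).
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    n_loci = len(next(iter(profiles.values())))
    for a, b in combinations(profiles, 2):
        shared = n_loci - loci_differing(profiles[a], profiles[b])
        if shared >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical allelic profiles (seven loci each); not taken from the database.
example = {
    "ST-A": (1, 1, 1, 1, 1, 1, 1),
    "ST-B": (1, 1, 2, 1, 1, 1, 1),  # SLV of ST-A
    "ST-C": (1, 1, 2, 1, 3, 1, 1),  # SLV of ST-B, DLV of ST-A
    "ST-D": (5, 6, 7, 8, 9, 10, 11),  # singleton
}
print(group_by_shared_alleles(example))
```

In this toy data set ST-C joins the group through ST-B even though it is only a DLV of ST-A, which mirrors how a missing intermediate ST can split what would otherwise be one group.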
Phylogenetic analysis. Minimum-evolution (ME) trees for each locus and for the concatenated sequences of each ST (3,682 bp) were constructed by Mega 3.1 software (27) using the Kimura two-parameter model to estimate the genetic distances. The statistical support of the nodes in the ME tree was assessed by 1,000 bootstrap resamplings. The nucleotide diversity of each locus and their respective standard errors were determined using Mega 3.1 as described elsewhere (30).
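For readers unfamiliar with the Kimura two-parameter model, the distance between two aligned sequences is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions, respectively. The Python sketch below computes this for a single pair of sequences. It is an illustration only, not the Mega 3.1 implementation; the function name is hypothetical, and skipping sites with gaps or ambiguous bases is an assumption made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("distance undefined: sequences too divergent")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

A tree-building method such as ME would take the matrix of these pairwise distances over all STs as its input.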
Estimates of recombination rates. Estimation of recombination rates was done as described previously (14, 35), where the per-allele and per-site recombination/mutation (r/m) parameter was calculated empirically. Briefly, any SLV allele differing by one nucleotide and not observed elsewhere in the database as part of another ST was considered to have arisen by mutation. An SLV allele differing by multiple nucleotides or containing a single nucleotide change observed as part of another ST in the database was considered to have originated by recombination.
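The decision rule described above can be written as a small Python function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, alleles are compared as aligned strings, and "observed elsewhere in the database" is simplified to mean that the variant allele itself occurs in another, unrelated ST.

```python
def classify_slv_allele(founder_allele, variant_allele, alleles_in_other_sts):
    """Return 'mutation' or 'recombination' for the variant allele of an SLV.

    founder_allele        -- allele sequence in the putative ancestral ST
    variant_allele        -- allele sequence at the same locus in the SLV
    alleles_in_other_sts  -- set of sequences for this locus seen in other STs
    """
    if len(founder_allele) != len(variant_allele):
        raise ValueError("alleles must be aligned to the same length")

    n_diffs = sum(a != b for a, b in zip(founder_allele, variant_allele))
    if n_diffs == 0:
        raise ValueError("variant allele is identical to the founder allele")

    # Multiple nucleotide differences are attributed to recombination.
    if n_diffs > 1:
        return "recombination"
    # A single change that already exists elsewhere is also attributed to
    # recombination (assumption: presence is checked at the allele level).
    if variant_allele in alleles_in_other_sts:
        return "recombination"
    return "mutation"


# Toy sequences for illustration only.
print(classify_slv_allele("ACGTACGT", "ACGTACGA", set()))          # mutation
print(classify_slv_allele("ACGTACGT", "ACGTACGA", {"ACGTACGA"}))   # recombination
print(classify_slv_allele("ACGTACGT", "ACCTACGA", set()))          # recombination
```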
Test for recombination. The START version 2.0 software package (26) was used to calculate the "standardized" index of association (ISA) (22). This statistic tests the null hypothesis of linkage equilibrium; i.e., if ISA = 0, then alleles are independently distributed at all loci analyzed (alleles are in linkage equilibrium) and recombination occurs frequently. The ratio between the numbers of synonymous (dS) and nonsynonymous (dN) substitutions was calculated by the method of Nei and Gojobori with the Jukes-Cantor correction implemented in Mega 3.1 (27). This ratio indicates the type of selection acting at each locus. The hypothesis tested was neutrality (dS = dN): dS/dN > 1 indicates that nonsynonymous sites are under selective constraint or purifying pressure (negative selection), dS/dN < 1 indicates positive selection, and dS/dN = 1 indicates neutrality. Congruence among the seven genes was determined as described by Brown et al. (4) employing the incongruence length difference (ILD) test (11). The version of the ILD test employed here is available in PAUP* v.4.0b (41). An exception was that for the ILD tests heuristic searches were performed instead of branch-and-bound searches. Both split-tree generation for individual loci and the phi test (Φw) for recombination were done using the SplitsTree v 4.8 software (23).
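As an illustration of what the standardized index of association measures, the Python sketch below computes it from a list of allelic profiles. It is not the START implementation. It assumes the commonly used form ISA = (VO/VE - 1)/(l - 1), where VO is the observed variance of the number of loci at which pairs of isolates differ, VE = Σ hj(1 - hj) is the variance expected under linkage equilibrium, hj is the gene diversity at locus j (estimated here with the n/(n - 1) correction), and l is the number of loci. The function name is hypothetical, and the significance test (by resampling) is omitted.

```python
from collections import Counter
from itertools import combinations


def standardized_index_of_association(profiles):
    """Standardized index of association for a list of allelic profiles.

    profiles -- list of equal-length tuples, one allele number per locus.
    """
    n_isolates = len(profiles)
    n_loci = len(profiles[0]) if profiles else 0
    if n_isolates < 2 or n_loci < 2:
        raise ValueError("need at least two isolates and two loci")
    if any(len(p) != n_loci for p in profiles):
        raise ValueError("all profiles must cover the same loci")

    # Observed variance of pairwise distances (number of mismatching loci).
    distances = [
        sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)
    ]
    mean_d = sum(distances) / len(distances)
    v_observed = sum((d - mean_d) ** 2 for d in distances) / len(distances)

    # Expected variance under linkage equilibrium, from per-locus gene diversity.
    v_expected = 0.0
    for locus in range(n_loci):
        counts = Counter(p[locus] for p in profiles)
        homozygosity = sum((c / n_isolates) ** 2 for c in counts.values())
        h = (n_isolates / (n_isolates - 1)) * (1 - homozygosity)
        v_expected += h * (1 - h)
    if v_expected == 0:
        raise ValueError("index undefined: no informative polymorphic loci")

    return (v_observed / v_expected - 1) / (n_loci - 1)


# Toy data: alleles at the three loci are perfectly associated.
linked = [(1, 1, 1), (1, 1, 1), (2, 2, 2), (2, 2, 2)]
print(standardized_index_of_association(linked))
```

With perfectly associated alleles, as in the toy data, the value is at its maximum of 1; values near 0 are expected when alleles at different loci are independent.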
Nucleotide sequence accession numbers. recA, dnaE, gyrB, dtdS, pntA, pyrC, and tnaA sequences were deposited in GenBank under accession numbers EU051383 to EU051622 and are also available at http://pubmlst.org/vparahaemolyticus.
TABLE 1. Nucleotide sequence variation for each MLST locus
TABLE 2. Sequence types, allele profiles, and geographic locations of the V. parahaemolyticus strains analyzed
FIG. 1. V. parahaemolyticus "population snapshot" obtained using eBURST v3. Nine groups were defined using stringent criteria (6/7 shared alleles). Among those nine groups, three were identified as clonal complexes, and their predicted clonal ancestors are shown in blue (ST-3, ST-36, and ST-34 are the founders of each complex). STs that are SLVs of each other are shown connected by black lines. DLV STs are shown connected by aqua lines. The sizes of the circles are relative to the number of strains in the ST.
Most of the nonpandemic V. parahaemolyticus isolates were isolated in the United States (n = 54) and Chile (n = 12). All Chilean isolates were environmental isolates from the region of Puerto Montt in the south of the country. They all belonged to different STs, although some of them showed a degree of genetic relatedness (they shared one or two allele types [e.g., ST-6 and ST-7, ST-10 and ST-1]) (Table 2). U.S. V. parahaemolyticus strains originated from the Pacific (including Alaska), Gulf, and Atlantic coasts. With the possible exception of strain NY-3483 (ST-36), which was isolated from a patient in New York who consumed oysters of unknown origin, isolates belonging to CC36 originated from the Pacific coast. Most of the isolates belonging to this clonal complex were of the O4:K12 serotype, a serotype historically associated with V. parahaemolyticus outbreaks linked to the consumption of raw oysters harvested from the U.S. Pacific coast (1). Strains belonging to CC34 originated principally from Gulf oysters. However, ST-34 isolates originating from both Louisiana and the Atlantic coast (Massachusetts) were noted. Based on allelic profiles, two groups isolated from Gulf oysters (D1 and D2) appeared to belong to the same clonal complex. These were not detected by eBURST as belonging to the same clonal complex because the ancestral type was missing or extinct (see "Clustering and phylogenetic analysis" below). A Connecticut oyster isolate (ST-53) was an SLV of ST-49 isolated in Chile. Other than the pandemic V. parahaemolyticus isolates, this was the only instance of an SLV occurring in two distant geographical locations.
Contribution of mutation and recombination to clonal diversification. (i) r/m parameter. Only 14 SLVs were observed in our data, and 7 belonged to the three clonal complexes. Most SLVs (10) arose from a recombination event, whereas only 4 arose by mutation (Table 3). This resulted in a per-allele r/m parameter of 2.5:1. In the case of the per-site analysis, the r/m parameter ratio was 8.8:1. These two parameters suggest that the initial steps of V. parahaemolyticus clonal diversification at allele or individual nucleotide sites are 2.5- and 8.8-fold more likely to occur by recombination than by point mutation, respectively.
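The per-allele figure follows directly from the counts given above, as the short Python check below shows. The helper name is hypothetical. The per-site ratio is computed the same way from counts of nucleotide changes, which are listed in Table 3 and are not reproduced here.

```python
from fractions import Fraction


def rm_ratio(recombination_events, mutation_events):
    """Ratio of recombination to mutation events as an exact fraction."""
    if mutation_events <= 0:
        raise ValueError("at least one mutation event is needed for a finite ratio")
    return Fraction(recombination_events, mutation_events)


# Per-allele counts stated in the text: 10 SLVs by recombination, 4 by mutation.
per_allele = rm_ratio(10, 4)
print(f"per-allele r/m = {float(per_allele):.1f}:1")  # prints "per-allele r/m = 2.5:1"
```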
TABLE 3. SLV allele variants found among the three clonal complexes and the different groups identified by eBURST in this study and identification of the events responsible for their evolution
Clustering and phylogenetic analysis. In order to validate the clustering and evolutionary model generated by eBURST, an ME tree was generated from the concatenated sequences of the seven loci of the 62 STs. For the most part, the eBURST clustering and the ME tree analysis were consistent with one another, but the ME tree gave better resolution and uncovered some phylogenetic relationships among groups or singletons that were not observed or resolved by eBURST (Fig. 2).
FIG. 2. An ME tree was constructed using the concatenated sequences of the seven loci of each of the 62 STs obtained in this study. Squares, circles, and triangles with different shading represent the three clonal complexes and the six doublets observed by eBURST. The scale represents the evolutionary distance, and bootstrap values over 50% are shown in the branches.
A test for incongruence was conducted in order to determine the impact of recombination on the V. parahaemolyticus populations analyzed. The ILD test for all seven HK genes, partitioned separately, led to the conclusion of incongruence (P = 0.001). To help identify the gene(s) responsible for the incongruence, each of the seven genes was partitioned against a combined matrix consisting of the remaining six. All genes were highly incongruent with the combined HK matrix (P = 0.001), suggesting that each of the seven genes contributes significantly to the overall signal of incongruence observed for the entire data set. Furthermore, when all possible pairwise ILD comparisons were performed, all of them exhibited incongruence (P = 0.001). Taken as a whole, these results indicate that recombination in those seven HK genes is probably frequent in V. parahaemolyticus.
Split trees were generated in an effort to visualize the impact of recombination in each locus and to verify the results obtained with the ILD tests (Fig. 3; see Fig. S1 in the supplemental material). The split trees generated for each locus, as well as for the concatenated sequences, showed reticulated structures. The majority of the strains also showed a star phylogeny radiating from the same central point, which suggests frequent recombination (23, 37). Furthermore, the phi test for recombination was significant in all loci analyzed (P < 0.05) (see the supplemental material).
FIG. 3. Split decomposition analysis of the concatenated sequences of the seven chosen loci for the 62 STs obtained in this study.
The MLST scheme described here is based on the allelic variation of seven HK genes, three from chromosome I (recA, dnaE, and gyrB) and four from chromosome II (dtdS, pntA, pyrC, and tnaA). The dS/dN ratios were higher than 1 for all the genes analyzed, indicating that they are under purifying pressure such that most amino acid substitutions are deleterious (6). Among the 100 V. parahaemolyticus strains used in this study, 62 different allelic combinations and an average of 33 alleles per locus were identified, indicating a high degree of genotypic diversity at slowly evolving loci. Three of the loci from chromosome II displayed lower nucleotide diversity (0.012 to 0.013) than the loci analyzed from chromosome I (0.029 to 0.032). It appears that the HK genes from chromosome II chosen for this study may be under different selective pressure than other regions of this chromosome.
This MLST scheme identified three major clonal complexes and six minor groups. Further sequence analysis (ME tree of concatenated sequences) shifted some groups into clonal complexes (e.g., D6 into CC36 and D1 and D2 into a potential clonal complex that was not identified by eBURST). The absence of the SLV strain linking these groups (D1 and D2) hid this relationship from eBURST. Additionally, numerous singletons were observed, demonstrating the high discriminatory capability of this scheme. The chosen genes appear to be well suited for broader population structure studies of V. parahaemolyticus. The observation of a high degree of genetic diversity in this study may be partially attributed to selection of particular V. parahaemolyticus strains in our collection. However, V. parahaemolyticus occurs in different habitats, such as seawater, sediment, gastrointestinal tracts of fish, chitin of zooplankton, etc., where nutrient contents, temperatures, and other physicochemical properties (e.g., pH) differ. Shifts in these parameters require frequent bacterial adaptation. These environments are often heavily populated with phages that can facilitate gene transfer among vibrios and other marine bacteria (24). We encourage other investigators to populate the database with data generated from V. parahaemolyticus strains isolated from other sources and geographical locations in order to further delineate the extent of diversity within the species.
In the current study, ST-3 was the only ST with an international distribution, and this was determined to be the ancestor of CC3. The other SLVs within CC3 were also internationally distributed (Korea, Bangladesh, and the United States) and were identical to ST-3 except for differences in dnaE, apparently resulting from different recombination events. Similar findings were reported by Chowdhury et al. (9) using an MLST scheme based on the fragments of four genes from chromosome I. Those investigators showed that 51 of 54 V. parahaemolyticus pandemic isolates were indistinguishable in the four loci analyzed and that the 3 remaining isolates differed only in the recA locus. Variability in the dnaE gene in the current study and in the recA gene in the previous study suggests that these genes are evolving more rapidly in the pandemic clonal complex, but analysis of a much larger set of strains would be necessary to confirm this observation. The analysis of seven genes instead of the four genes used in the previous study further confirms the homogeneous nature of the pandemic clonal complex, independent of geographical site of isolation. While new variants are arising, they apparently have not yet become well established and are not replacing, to any significant degree, the ancestor type (ST-3) that continues to cause outbreaks in some countries (e.g., Chile in 2006) (17).
The persistence of the same ST over extended periods (e.g., ST-3 in numerous countries from 1996 to 2005 and ST-36 and ST-50 on the U.S. Pacific coast from 1988 to 1997 and 1997 to 2004, respectively) is indicative of a clonal population structure. The clonal population structure is also supported by the existence of significant linkage disequilibrium between the MLST alleles (ISA = 0.7626; P < 0.05). The ISA decreased when pandemic strains were excluded or the analysis was limited to CC36. This phenomenon has also been reported by Miragaia et al. (35) for Staphylococcus epidermidis. However, the almost threefold per-allele and ninefold per-site higher frequencies of recombination events relative to mutations are uncharacteristic of highly clonal bacteria. A history of frequent recombination events is further supported by the lack of any observed congruence among the genes analyzed (4). Furthermore, split trees for both concatenated sequences and individual loci resulted in star phylogeny structures radiating from the same central point, which suggests frequent recombination (37). Statistically significant evidence for recombination at each locus was also detected with the phi test (23). These apparently contradictory results for V. parahaemolyticus clonality parallel those reported for other bacteria, such as V. cholerae (19) and S. epidermidis (35), for which an epidemic population structure has been proposed. Therefore, we suggest that the V. parahaemolyticus population structure follows this "epidemic" model of clonal expansion in bacteria, where more adapted clonal complexes emerge from a background of highly recombinogenic bacteria (12). These clones then diversify predominantly by recombination rather than by point mutation. The paradox between clonality and high diversity should be resolved as more strains are added to the MLST database.
Two other clonal complexes were identified among the analyzed strains (CC36 and CC34). This is the first definitive demonstration of V. parahaemolyticus clonal complexes other than the pandemic clonal complex and further supports the epidemic model. CC36 has been linked almost exclusively to outbreaks associated with the consumption of raw oysters harvested from the U.S. Pacific coast since the 1970s (1). This clonal complex also included a 1998 clinical isolate from New York that was linked to oysters with an unknown harvest location (NY-3483, ST-36); it has been speculated that those oysters originated from the state of Washington (5). While CC36 has been geographically restricted, it is similar to the pandemic clonal complex in that it displays multiple serotypes (O4:K12 and O12:K12) within the ancestor ST (ST-36) and consists of at least six STs (if D6 is included in this clonal complex). Furthermore, one DLV of ST-36 isolated in 1982 (ST-21) is consistent with the earlier emergence of CC36 relative to the pandemic CC3. Finally, ST-21 is the earliest isolate in CC36 and was most closely related to recent ST-59 isolates from Alaska (2004). CC34 consisted of six isolates with four STs (ST-32, -33, -34, and -35) and two serotypes (O4:K8 and O4:K9). Five of the isolates were from oysters collected in Alabama, Louisiana, and Massachusetts; the remaining clinical isolate was from a California patient with unknown food consumption history. This clonal complex appears to be more diverse than either CC3 or CC36, and its association with human illnesses is less certain.
Consistent with the results of Chowdhury et al. (9), there does not appear to be a linkage between serotype and ST among pandemic strains, as four different serotypes were observed among the ST-3 isolates and the three SLVs in CC3 were all O3:K6. If, as hypothesized (31), seroconversion occurs by lateral transfer of genes that participate in the synthesis of either O or K antigens, it appears that these genetic acquisitions are independent and probably arise from donors other than those which might affect the HK genes analyzed in this study. Taken together and consistent with other studies (7, 28, 31), these findings support the conclusion that serotyping for V. parahaemolyticus need not be considered obligatory for further characterization of isolates and that serotyping may actually be misleading and of limited epidemiological value relative to molecular-based strain typing tools.
Analysis of the ME tree of the concatenated sequences of all loci showed that V. parahaemolyticus forms a very homogeneous, well-supported group. However, three of the STs (ST-1, -2, and -62) formed a separate and distinct cluster outside the main group. This deviation from the main group appeared to be due to recombination with other, non-V. parahaemolyticus vibrios in some loci (e.g., gyrB, recA [ST-1], and dnaE). Analysis of individual trees for each locus suggests independent evolution of each gene and the importance of examining multiple genes for establishing phylogenetic relationships between V. parahaemolyticus strains (data not shown). A multiple gene sequence approach provides more detailed information on the genetic relationships among different V. parahaemolyticus strains because it buffers the impact of lateral gene transfer such as was observed for strains that possessed ST-1, -2, and -62.
Overall, the data reported in this study indicate that V. parahaemolyticus is genetically diverse with a semiclonal population structure and that frequent recombination events seem to play an important role in the first steps of clonal diversification. Broader application of this MLST scheme will enhance understanding of the molecular epidemiology and evolution of this pathogen. This MLST scheme provides a universally available mechanism for timely recognition of evolutionary trends and emergence of V. parahaemolyticus clonal complexes, thus providing an early warning system. Prompt application of interventions (i.e., ballast water controls and harvest restrictions) could reduce public health consequences if new clonal complexes with enhanced virulence emerge or spread.
This study was supported by a grant from the U.S. Department of Agriculture, Cooperative State Research, Education and Extension Service, National Research Initiative, Competitive Grants Program, Epidemiological Approaches to Food Safety, project no. 2004-35212-14882.
Published ahead of print on 15 February 2008.
Supplemental material for this article may be found at http://jb.asm.org/.