Journal of Bacteriology, May 2006, p. 3645-3653, Vol. 188, No. 10
0021-9193/06/$08.00+0 doi:10.1128/JB.188.10.3645-3653.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Infectious & Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom,1 Bacterial Microarray Group, Medical Microbiology, Department of Cellular and Molecular Medicine, St George's, University of London, Cranmer Terrace, London SW17 0RE, United Kingdom2
Received 6 January 2006/ Accepted 12 February 2006
The incidence of Y. enterocolitica is apparently increasing worldwide (7). In parts of Europe, Y. enterocolitica rivals salmonellae as a cause of gastrointestinal disease (7, 8, 21). Y. enterocolitica can be isolated from a wide variety of sources, i.e., surface water; many food products, such as seafood and dairy products; and livestock, particularly pigs (7, 27, 28, 52, 58). This may be due to the consumption of uncooked and undercooked meat of porcine origin, such as chitterlings or tongue (7, 30, 38, 52). The transmission of Y. enterocolitica from pigs to humans is thought to occur through the contamination of pork products; however, the relationship between isolates from livestock and human disease is poorly understood.
A recent abattoir survey in the United Kingdom (1999 and 2000) revealed fecal carriage rates of Y. enterocolitica in cattle, sheep, and pigs sent for slaughter of 6.3, 10.7, and 26.1%, respectively (43). This high carriage rate in livestock was unexpected and suggests that livestock may contribute to human yersiniosis.
Y. enterocolitica is classified into six biotypes (1A, 1B, 2, 3, 4, and 5) based upon varied biochemical properties. The six biotypes can be separated into three groups according to their lethality in a murine infection model which closely resembles the naturally acquired human infection: biotype 1A, which is considered to be mostly nonpathogenic (8); biotypes 2 to 5, which are low in pathogenicity; and biotype 1B, which is highly pathogenic (4, 14). Biotypes with low pathogenicity are generally isolated in Europe and Japan and are termed Old World strains. By contrast, highly pathogenic strains are most commonly isolated in North America and are termed New World strains.
Both the low- and high-pathogenicity groups carry a 70-kb plasmid that has been termed virulence plasmid pYV (29, 51, 61). This plasmid is usually absent from the nonpathogenic 1A biogroup. In addition, highly pathogenic biotype 1B strains also possess a high-pathogenicity island (HPI) (13). The importance of this gene cluster was demonstrated by transferring the HPI core genes of WA-C serotype O:8 biotype 1B to strain MRS40 O:9 biotype 2 of Y. enterocolitica (low pathogenicity), resulting in increased virulence (49). More recently, a novel virulence-associated type II secretion system unique to high-pathogenicity Y. enterocolitica strains has been described (36). However, it is unlikely that these genetic differences alone can explain the relative pathogenicity of the three groups.
Comparative genomic DNA (gDNA) microarray analysis has been used to investigate several pathogenic bacterial species in relation to pathogenesis and host specificity (17, 19, 20, 23, 32, 39). Microarray technology, allied to complex mathematical analysis to determine phylogeny, has provided a sensitive and robust method to examine the genetic relatedness of bacterial populations. The genetic relationships described by Bayesian phylogeny of a DNA-DNA microarray data set can then be correlated against the known phenotypes and ecological behavior of each bacterial strain in the analysis; this is particularly useful when studying the epidemiology and host association of pathogens (17). Comparison of strains isolated from different hosts, as well as pathogenic and nonpathogenic strains, can reveal predicted coding sequences (CDSs) that may be important for virulence, pathogen-host interactions, and transmission. For example, within the two other human pathogenic Yersinia species, microarray studies of Yersinia pseudotuberculosis revealed 11 DNA loci that were absent or highly divergent compared to the Yersinia pestis CO92 genome (32). Acquisition of these loci may help to explain why Y. pestis can cause such a vastly different disease compared to its close enteropathogenic relative Y. pseudotuberculosis, from which it evolved an estimated 1,500 to 20,000 years ago (1).
In this study, we have carried out a whole-genome analysis of 94 isolates of Y. enterocolitica from human and livestock sources with a whole-genome microarray based on the recently sequenced genome of Y. enterocolitica 8081, biotype 1B serotype O:8. The DNA microarray data have been combined with sensitive Bayesian method-based algorithms to gain new insight into the population structure of Y. enterocolitica. This analysis revealed several new potential virulence factors, suggested livestock as an important source of human yersiniosis, and provided basic information on the evolution of the species Y. enterocolitica.
Y. enterocolitica strains were cultured either in Luria-Bertani (LB) broth with constant shaking at 200 rpm or on solid medium prepared with LB agar. Incubation of both liquid cultures and agar plates was carried out at 28°C. All bacterial strains used in this study were stored in LB broth containing glycerol (15%) at -80°C. Bacterial chromosomal DNA was prepared with the Wizard gDNA purification kit (Promega). DNA samples were then analyzed by agarose gel electrophoresis and quantified with a GeneQuant spectrophotometer (Amersham). All gDNA was stored at -20°C in distilled H2O.
Microarray construction and hybridization. The Y. enterocolitica-specific microarray was designed to include 4,208 predicted CDSs from the Y. enterocolitica 8081 chromosome and 83 from plasmid pYV. PCR products were designed by the approach described by Hinds et al. (33). Ten PCR products for each CDS were designed with Primer3 (56), and then, based on BLAST analysis, the optimum PCR product to represent each CDS was selected. PCR primers were synthesized by MWG Biotech (Ebersberg, Germany), and high-throughput PCR amplification was carried out with a liquid-handling and PCR amplification robot (RoboAmp 9600; MWG Biotech). PCR products were analyzed by agarose gel electrophoresis to ensure a unique band of the correct size and 5% of the products were confirmed by DNA sequencing. Microarrays were constructed by robotic spotting of the PCR products in duplicate on UltraGaps amino-silane-coated glass slides (Corning) with a MicroGrid II (BioRobotics). Further details of microarray construction methodology can be found in reference 34.
All of the strains in the collection were competitively hybridized with the Y. enterocolitica 8081 1B microarray, which had duplicate spots of the 4,291 CDSs. Hybridizations were performed by modifying a protocol previously described by Dorrell et al. (20). Test gDNA was labeled with Cy5-dCTP, and Cy3-dCTP-labeled Y. enterocolitica 8081 gDNA was used as a common reference for all hybridizations. UltraGaps slides were prehybridized in 3.5x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate)-0.1% sodium dodecyl sulfate (SDS)-10 mg/ml bovine serum albumin for at least 20 min at 65°C. After prehybridization, the slides were rinsed in distilled water for 1 min and then in isopropanol for 1 min. The slides were dried by centrifugation at 1,200 rpm for 5 min and stored in the dark until ready for use. Control Cy3-labeled and test Cy5-labeled gDNA samples were mixed together and purified with QIAGEN PCR purification columns. The mixed control and test sample was eluted from the column with 71.5 µl of distilled H2O. The labeled DNA was mixed with hybridization solution at a final concentration of 4x SSC-0.3% SDS. The sample was then denatured and added to a prehybridized microarray slide. Two Lifter Slips (22 by 22 mm; Eyrie Scientific) were placed onto each array before sealing in a humidified hybridization chamber (Telechem International). The chambers were immersed in a water bath at 65°C for 16 to 20 h. The slides were washed in 400 ml of 1x SSC-0.06% SDS at 65°C for 2 min, followed by two separate washes in 400 ml of 0.06x SSC at room temperature. The microarray slides were then scanned with a GMS 418 Scanner (Genetic Microsystems). Spot fluorescence intensities were acquired with ImaGene 5.5 (BioDiscovery Inc.).
Microarray data analysis and comparative phylogenomics. Whole-genome comparisons were carried out with GeneSpring v6.1 (Silicon Genetics). Samples were normalized in GeneSpring by the following criteria. Intensity values below 0.01 were set to 0.01, and a ratio was calculated by dividing the signal channel intensity by the control channel intensity for each gene in each sample; if the control channel was below 0.01, then 0.01 was used instead, and if the control channel and the signal channel were both below 0.01, then no data were reported. Each sample was normalized by dividing each measurement by the 50th percentile of all measurements in that sample with only genes flagged present or marginal by ImaGene to calculate the percentile. Normalized raw and control data for each array were exported from GeneSpring, transformed into log ratio data, and analyzed with GACK software to determine whether genes were (i) present or (ii) absent or highly divergent (40). This software determines dynamic cutoffs for each array individually. The default GACK settings for trinary analysis were used (data histogram bin size, 0.10; data smoothing, none; peak modeling, normal curve; trinary output, trinary %EPP cutoff 1 = 0; trinary %EPP cutoff 2 = 100). The output of GACK was transformed into NEXUS format, and the relationship of the strains was determined with Bayesian method-based algorithms implemented through Mr Bayes v3.0 software (53) as described in reference 17. Mr Bayes requires data to be in a binary format, so any genes designated as marginal were attributed to the present category for further analysis.
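The normalization rules described above can be illustrated with a minimal Python sketch. It is not the GeneSpring implementation; the function names are hypothetical, and the choice of a base-2 logarithm is an assumption because the text does not state the base of the log ratio.

```python
import math
from statistics import median

FLOOR = 0.01  # intensity floor stated in the text


def ratios(signal, control, floor=FLOOR):
    """Per-gene signal/control ratio; None where both channels are below the floor."""
    out = []
    for s, c in zip(signal, control):
        if s < floor and c < floor:
            out.append(None)  # no data reported for this gene
        else:
            out.append(max(s, floor) / max(c, floor))
    return out


def normalize(ratio_values, usable):
    """Divide each ratio by the 50th percentile of the ratios whose spots
    were flagged present or marginal (usable[i] is True for those spots)."""
    reference = [r for r, ok in zip(ratio_values, usable) if ok and r is not None]
    centre = median(reference)
    return [None if r is None else r / centre for r in ratio_values]


def log_ratios(values):
    """Log-transform normalized ratios (base 2 is an assumption)."""
    return [None if v is None else math.log2(v) for v in values]
```

For example, `log_ratios(normalize(ratios(signal, control), usable))` yields one log ratio per gene, which is the kind of input a cutoff-based caller such as GACK would then classify.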
All analytical procedures were identical to those described by Champion et al. (17). In summary, microarray data analysis was undertaken by Bayesian methods performed with a Metropolis-coupled Markov chain Monte Carlo (Mr Bayes) (35) with a 16-category gamma distribution to model the presence or absence rate heterogeneity per gene throughout the genome of the strain. A four-chain Metropolis-coupled Markov chain Monte Carlo was performed for 1.1 x 10^6 generations, and a burn in of 0.1 x 10^6 generations was used.
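The conversion of trinary GACK calls into a binary matrix for Mr Bayes, as described in the preceding paragraphs, might look like the following sketch. The call labels ("present", "marginal", "absent"), the function names, and the strain names are hypothetical, and the `datatype=standard` format line is an assumption rather than the setting the authors used.

```python
def to_binary(call):
    """Map a trinary call to '1' (present or marginal) or '0' (absent or highly divergent)."""
    return "0" if call == "absent" else "1"


def to_nexus(calls_by_strain):
    """Build a minimal NEXUS data block from {strain: [call, call, ...]}.

    Strain names are assumed to contain no NEXUS punctuation such as '/';
    names like 'Y14/02' would need quoting or renaming first.
    """
    names = list(calls_by_strain)
    nchar = len(calls_by_strain[names[0]])
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(names)} nchar={nchar};",
        "  format datatype=standard missing=?;",  # assumed format line
        "  matrix",
    ]
    for name in names:
        row = "".join(to_binary(c) for c in calls_by_strain[name])
        lines.append(f"  {name} {row}")
    lines += ["  ;", "end;"]
    return "\n".join(lines)
```

Folding marginal calls into the present category loses some information, but it keeps every CDS in the matrix instead of discarding ambiguous ones.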
The resulting tree was viewed by TREEVIEW (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html). In addition, for phylogenetic analysis, if 5% of the CDSs were flagged (poor-quality spot) in a strain, then the strain was removed. If 10% of the strains had a CDS flagged, the CDS was removed. By the 5% criterion, strains Y14/02 (cattle, biotype 1A serotype O6,30) and Y13/02 (cattle, biotype 1A serotype O6,30) were not analyzed; therefore, 92 strains were included in the phylogenomic analysis.
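The two quality filters above can be sketched as follows. This is an illustration under stated assumptions: the text does not say whether the 5% and 10% thresholds are inclusive or whether the CDS filter is computed before or after strain removal, so the sketch treats both thresholds as "at least" and applies the strain filter first. The function and variable names are hypothetical.

```python
def filter_matrix(flags, strain_cutoff=0.05, cds_cutoff=0.10):
    """flags: {strain: [bool, ...]}, True marking a poor-quality (flagged) spot.

    Returns (kept strain names, kept CDS indices).
    """
    # Step 1: drop strains in which at least 5% of CDSs are flagged (assumed inclusive).
    kept = {s: f for s, f in flags.items() if sum(f) / len(f) < strain_cutoff}
    if not kept:
        return [], []
    # Step 2: drop CDSs flagged in at least 10% of the remaining strains.
    n_cds = len(next(iter(kept.values())))
    kept_cds = [
        i for i in range(n_cds)
        if sum(f[i] for f in kept.values()) / len(kept) < cds_cutoff
    ]
    return sorted(kept), kept_cds
```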
Identification of genes contributing to clusters. To identify genes which contributed to the generation of the pathogenic and nonpathogenic clusters, MacClade 4 analysis of phylogeny and character evolution software was used (42). The NEXUS file for all strains and the one-millionth tree from Mr Bayes were loaded into MacClade. The tree was then rooted with all of the 1B strains. Characters were then traced with the "trace all changes calculations" set to unambiguous changes only (minimum number changes). If a gene was absent or highly divergent in a particular strain, then the branch representing that strain was colored yellow. If the gene was present, then the branch was colored blue.
Nucleotide sequence accession numbers. Fully annotated microarray data have been deposited in BµG@Sbase (accession number E-BUGS-36; http://bugs.sgul.ac.uk/E-BUGS-36) and also ArrayExpress (accession number E-BUGS-36).
Core set of genes present in all test strains identified. Genomic comparisons of 94 isolates were used to calculate the minimal core gene set. This was achieved by calculating the total number of genes that had a GACK score of "present" in every strain and control strain Y. enterocolitica 8081 1B. The minimal core gene set for Y. enterocolitica was 894 CDSs; this low core gene value indicates that we have sampled a diverse collection of strains. The core gene set represents a surprisingly low value of 20.8% of the total genome (Fig. 1), suggesting that the pangenome of Y. enterocolitica is vast and that the size of the variable component dwarfs that of the core set of genes. By contrast, in studies of similar sample size of C. jejuni and S. aureus, the reported core set values were 59.2% and 78%, respectively (17, 25). As expected, many of the functional categories that are involved in essential housekeeping functions, such as DNA and RNA metabolism, protein processing and secretion, cell structure, cellular processes, and energetic and intermediary metabolism, were represented in the core gene set. Many of the accessory genes were CDSs likely to have been acquired by lateral gene transfer, such as transposons, insertion sequences, phages, and pathogenicity islands.
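The core gene set calculation described above reduces to an intersection over strains, which can be sketched as below. The data layout and function names are hypothetical; the only figures taken from the text are the 894 core CDSs and the 4,291 CDSs on the array.

```python
def core_genes(calls):
    """calls: {strain: {cds_id: 'present' | 'marginal' | 'absent'}}.

    A CDS is core only if it is scored 'present' in every strain.
    """
    per_strain = list(calls.values())
    candidates = set(per_strain[0])
    return sorted(
        cds for cds in candidates
        if all(strain.get(cds) == "present" for strain in per_strain)
    )


def core_fraction(n_core, n_total):
    """Core set size as a percentage of all CDSs on the array."""
    return 100.0 * n_core / n_total
```

With the values reported here, `core_fraction(894, 4291)` gives about 20.8%, matching the figure quoted in the text.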
FIG. 1. Venn diagrams showing core gene sets calculated with CDSs that were classified as present by GACK analysis. A core gene set of 894 CDSs for Y. enterocolitica is shown in white. Core gene sets for high-pathogenicity (path), low-pathogenicity, and nonpathogenic Y. enterocolitica strains are shown in the red (2,997 CDSs), green (1,464 CDSs), and dark blue (1,362 CDSs) circles with overlapping regions, respectively. This excludes 1,105 chromosomal and pYV CDSs that are not core to any of the biogroups.
Comparative phylogenomic analysis. Phylogenomic analysis was carried out with microarray data generated from a strain collection consisting of 35 human isolates, 35 pig isolates, 15 sheep isolates, and 7 bovine isolates. Phylogenomic analysis was initially carried out with all 4,208 chromosomal CDSs of Y. enterocolitica 8081 and 83 CDSs from virulence plasmid pYV. This resulted in division of the low-pathogenicity clade into further clades containing either strains that possessed pYV or those which had lost pYV. Because pYV can be readily lost during laboratory passage, the CDSs present in pYV were excluded from the subsequent analysis (8). These data showed that by Bayesian method-based phylogeny the Y. enterocolitica isolates fell into three distinct clades (high pathogenicity, low pathogenicity, and nonpathogenic), each lineage supported by Bayesian probabilities (P = 1.0) (shown in Fig. 2). These distinct clades formed irrespective of pYV, thus confirming that the presence of this plasmid is not the sole discriminatory factor for virulence in this species. These results also confirmed that the traditional biotyping is a useful method for distinguishing strains of Y. enterocolitica.
FIG. 2. Bayesian phylogenomic relationships of all 92 strains following DNA-DNA microarray hybridization against the genome strain (*). Numerical values represent the probability (P) of support for each internal branch, where only values above 0.95 are considered robust. Serotypes and biotypes are designated ST and BT, respectively. A more detailed analysis will be presented elsewhere.
The low-pathogenicity clade contained 51% of the strains analyzed (47/92), all of which were of pathogenic biotype 2, 3, or 4. These strains had been isolated from livestock (cattle, pigs, and sheep) sent for slaughter for human consumption or were human isolates from patients presenting to their GPs with diarrhea or collected during the IID study. Within the low-pathogenicity clade, the isolates partially separated according to biotype and serotype. Twenty-five isolates comprising 12 biotype 3 serotype O:5,27 strains, 4 biotype 3 unserotypeable strains, 3 biotype 3 serotype O:5 strains, 2 biotype 3 serotype O-rough strains, 2 biotype 4 serotype O:5,27 strains, and 2 biotype 4 serotype O:3 strains clustered together. Nine isolates comprising one biotype 3 serotype O:5 strain and eight biotype 4 serotype O:3 strains clustered together. Thirteen isolates which were slightly more heterogeneous, comprising 10 biotype 3 serotype O:9 strains, 2 biotype 2 serotype O:9 strains, and 1 biotype 3 unserotypeable strain, clustered together.
The third major clade that formed was the nonpathogenic clade. This clade contained 40.2% of the strains tested (37/92); 94.6% of these strains (35/37) were biotype 1A strains. These strains were isolated from livestock (cattle, pigs, and sheep) sent for slaughter for human consumption or were human clinical isolates from patients presenting to their GPs with diarrhea or collected during the IID study. Two strains of pathogenic biotypes, Y213/02 (biotype 4 serotype O:3) and Y21/03 (biotype 1B serotype O:19), fell within this clade. These two isolates have been retyped and their biotypes confirmed; in addition, it was noted that strain Y21/03 was completely noninvasive in tissue culture (A. McNally, personal communication). Thus, this analysis has highlighted strains of particular interest that warrant further investigation. The distribution of isolates within the nonpathogenic clade was more heterogeneous than in the low-pathogenicity clade. As with the low-pathogenicity strains, human and livestock isolates were distributed throughout the clade and did not cluster according to source (host).
Although Y. enterocolitica 1A strains are traditionally considered to be nonpathogenic, three of the human isolates were recovered from patients presenting to their GPs with diarrhea or vomiting. Biotype 1A strains are generally thought to be avirulent because of the lack of virulence plasmid pYV and classical virulence determinants such as attachment-and-invasion locus (ail), myf, and ystA genes and a functional inv gene, which is typical of invasive isolates (45, 50). However, there is growing epidemiological evidence suggesting that 1A strains can cause disease. Y. enterocolitica biotype 1A strains have been isolated from patients presenting with gastrointestinal illness in Australia, New Zealand, South Africa, Chile, Switzerland, Canada, and the United States (6, 11, 47). Direct comparisons of a clinical 1A isolate against an environmental 1A isolate by subtractive hybridization identified 54 sequences that were present in the clinical isolate but absent from the environmental isolate (59). Genetic differences between asymptomatic and case isolates in this study could not be found. It may be that host immune factors play a role in the clinical outcome of infection rather than genetic differences between isolates. Together, this evidence suggests that additional analysis of biotype 1A isolates from the environment, livestock, and clinical settings is required before further assumptions about the pathogenicity of this biotype can be made. The Y. enterocolitica 8081 microarray represents more than 4,200 CDSs; however, one disadvantage of using single-genome microarrays is that it is only possible to detect CDSs that are present in the strain used to make the array.
Genes present exclusively in American highly pathogenic (biotype 1B) strains. CDSs that were present in all eight highly pathogenic biotype 1B strains were identified from the CDSs that had been designated as present by GACK analysis. GACK analysis designates CDSs present, marginal, or absent without the use of defined arbitrary cutoffs. These CDSs were present in all eight highly pathogenic U.S. strains and absent from or highly divergent in all other isolates. One hundred twenty-five chromosomal CDSs were identified by this method as detailed elsewhere (http://bugs.sgul.ac.uk/E-BUGS-36). Biotype 1B strains are highly pathogenic in mice, and genetic regions have previously been identified which have been shown to be unique to this biogroup. The list of 125 CDSs included regions of the previously characterized HPI (YE2611 to YE2622) (12, 13), chromosomal type III secretion system Ysa (YE3536 to YE3561) (26, 31), and a recently identified type II secretion system, Yts1 (YE3562 to YE3579) (36), all of which are unique to highly pathogenic biotype 1B strains. These results validate our microarray methodology and analysis. In addition, several of the 125 CDSs identified were insertion elements, phage-related proteins, and 32 hypothetical proteins (representing 40% of the total number of CDSs). Of the remaining high-pathogenicity-specific CDSs, seven were highlighted as particularly noteworthy. BLASTX (3) analysis of the seven previously uncharacterized proteins revealed that these CDSs had highly significant similarities to other bacterial virulence determinants and thus may contribute to the virulence of biotype 1B strains (Table 1). YE0126 showed amino acid similarity to hemophores present in Y. pestis, Y. pseudotuberculosis, and Erwinia carotovora (16); YE0344 was similar to the MceH protein in Klebsiella pneumoniae (16, 41); YE2408 was similar to hemolysin activator proteins from many other bacteria (16); YE4052 was similar to metalloproteases found in several other bacteria; and YE4088, which is encoded by a pseudogene in Y. pestis, was similar to sensor kinase proteins found in many other bacterial species (16). YE2447 showed similarity to OspG, a protein which is secreted by the Mxi-Spa type III secretion machinery in Shigella flexneri (10, 37), and YE3614 was similar to a probable SPI2 translocated effector protein found in Chromobacterium violaceum and phospholipases found in many bacteria (9). Effector proteins have been shown in many bacteria, including yersiniae, to be important in bacterial virulence. YopE and YopH are intracellular effectors encoded by virulence plasmid pYV of Y. enterocolitica. YopH is a phosphotyrosine phosphatase which is thought to protect Y. enterocolitica from phagocytosis, contribute to inhibition of cytokines produced by T cells, and prevent the ability of B cells to upregulate surface expression of the costimulatory molecule B7.2. YopE disrupts actin filaments (54) by depolymerization of actin stress fibers via activation of Rho GTPase (55). We believe that these additional CDSs could help to explain why biotype 1B strains exhibit high virulence in mice and warrant further investigation.
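The selection rule used in this section (present in every biotype 1B strain, absent from or highly divergent in every other isolate) can be expressed as a short filter over the binary presence/absence matrix. This is a sketch with hypothetical names and toy identifiers, not the procedure the authors ran.

```python
def group_specific(binary, group):
    """binary: {strain: {cds_id: 0 or 1}}; group: set of strain names.

    Returns CDSs scored 1 in every group member and 0 in every other strain.
    """
    members = [calls for strain, calls in binary.items() if strain in group]
    others = [calls for strain, calls in binary.items() if strain not in group]
    if not members:
        return []
    return sorted(
        cds for cds in members[0]
        if all(m[cds] == 1 for m in members)
        and all(o[cds] == 0 for o in others)
    )
```

The same function could be reused for the other clades by passing a different set of strain names as `group`.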
TABLE 1. Selected CDSs that are present in all 1B biotype strains and absent from all other isolates
TABLE 2. CDSs present only in all biotype 2 to 4 strains and control strain Y. enterocolitica 8081 1B
FIG. 3. Distribution of YE1820 among Y. enterocolitica strains. A parsimony-based gene analysis for determining the distribution of individual CDS YE1820 throughout the phylogenetic tree is shown. Strains from which YE1820 is absent are yellow, and strains in which YE1820 is present are blue. Strains in the low-pathogenicity clade and the high-pathogenicity clade all contain YE1820. YE1820 is absent from strains in the nonpathogenic clade.
FIG. 4. Schematic of CDSs up- and downstream of ail in Y. enterocolitica 8081. The CDSs are written below with arrows denoting the direction of transcription. CDSs filled with hatched lines were absent from all 1A isolates. CDSs with asterisks were present in all biotype 2 to 4 isolates. The majority of CDSs from YE1799 to YE1827 were absent from both biotype 1A and 2 to 4 isolates.
TABLE 3. CDSs present only in all biotype 1A strains and control strain Y. enterocolitica 8081 1B
FIG. 5. Distribution of YE0904 among Y. enterocolitica strains. A parsimony-based gene analysis for determining the distribution of individual CDS YE0904 throughout the phylogenetic tree is shown. Strains from which YE0904 is absent are yellow, and strains in which YE0904 is present are blue. Strains in the nonpathogenic clade and the high-pathogenicity clade all contain YE0904. YE0904 is absent from strains in the low-pathogenicity clade.
Evolution of Y. enterocolitica. In evolutionary terms, Y. enterocolitica is considered to be distantly related to Y. pseudotuberculosis and Y. pestis. Indeed, it has been suggested that Y. enterocolitica is as closely related to the other pathogenic yersiniae as E. coli is to Salmonella species (48). The data obtained from microarray analysis were used to generate a phylogenetic tree based on a Bayesian method-based algorithm incorporating a gamma distribution to model rate heterogeneity across the genome. Each horizontal line on this tree represents the average number of gains or losses of a gene per gene of the genome strain; thus, the genetic distances among the strains in the three clades can be measured. Determining the exact evolutionary order of speciation or subspeciation for the three clades of Y. enterocolitica is not possible without an appropriate outgroup (a species which unequivocally evolved before the Y. enterocolitica complex). However, a brief analysis shows that low-pathogenicity and nonpathogenic strains were the most closely related genetically (approximately 0.19 gain or loss of a gene per gene) and that the nonpathogenic clade was more closely related genetically to the highly pathogenic clade than were the low-pathogenicity strains (approximately 0.33 compared to 0.37 gain or loss of a gene per gene). With the assumption that all genetic distances should be reasonably equidistant from the common ancestor (a molecular clock type of assumption), the highly pathogenic clade is a direct descendant of the most ancient common ancestor, and this could imply that the formation of mildly pathogenic and nonpathogenic strains resulted from a biogeographic movement between the New World and the Old World.
The alternative rooting is presented herein and provides the most parsimonious evolutionary explanations, where the common ancestor is slightly more compatible with an Old World origin and is also the progenitor of the virulence plasmid, which would then be lost on a single occasion in the ancestor to the nonpathogenic and highly pathogenic clades.
In either scenario, a pathogenic phenotype could be present in the ancestral Y. enterocolitica strain where pathogenic determinants, for example, ail, are retained in both highly pathogenic and mildly pathogenic clades but lost in the formation of the nonpathogenic clade. Furthermore, the cumulative loss of both the virulence plasmid and the pathogenicity islands would then have resulted in the nonpathogenic strain phenotype. Alternatively, islands of pathogenicity could have been acquired independently through horizontal transmission; the large number of phage-related proteins surrounding ail, for example, which were revealed during the sequencing of the Y. enterocolitica 8081 1B strain, is consistent with this possibility. The order of evolution of these three Y. enterocolitica clades is important because it provides a model for the evolution of pathogenicity and how over time an organism may become more or less virulent. Biotype 1B strains are termed New World strains as they are rarely reported outside of the United States. It is possible to speculate that, because of a shift in geographical location, the biotype 1A-1B intermediary was able to acquire virulence determinants from its new environment, resulting in the highly pathogenic 1B biotype, which is genetically much more diverse in comparison with the other biotypes. The split between 1A and 1B strains appears to be ancient; it is unlikely to be due to recent human migratory movements to North America and is more likely due to ancient geographical land mass movements.
Our results provide the first detailed whole-genome comparison of Y. enterocolitica. A microarray was constructed with duplicate reporter elements representing all chromosomal and plasmid-predicted (4,291) CDSs of sequenced strain Y. enterocolitica 8081 1B. DNA microarray analysis of 94 biotype 1A, 1B, and 2 to 4 strains revealed extensive genome diversity within the species Y. enterocolitica. The core minimal gene set in Y. enterocolitica was 894 CDSs. Bayesian method-based algorithms were then used alongside tracing of character evolution to reveal the distinct population structure of the 94 Y. enterocolitica strains. Based upon DNA-DNA hybridizations and 16S rRNA data, Neubauer et al. proposed the division of Y. enterocolitica into two subspecies, Y. enterocolitica subsp. enterocolitica for strains of American origin and Y. enterocolitica subsp. palearctica for strains of European origin (46). However, given the small core genome, the unequivocal distinction between clades, and the genetic distance between them, we believe that our data confirm that Y. enterocolitica is a highly heterogeneous species and add weight to the case for the existence of three subspecies.
This method has also allowed a large proportion of the CDSs that contribute to the formation of each clade to be identified, thereby revealing several new potential virulence determinants, which may help to explain the differences in pathogenicity observed in the various biotypes of this species. The approach described in this study provides a methodological prototype of robust phylogenomics that should be applicable to the study of other microbes.
This work was funded by the Department for Environment, Food and Rural Affairs, United Kingdom.