Journal of Bacteriology, January 2003, p. 553-563, Vol. 185, No. 2
0021-9193/03/$08.00+0 DOI: 10.1128/JB.185.2.553-563.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, California 94305-5124,1 Department of Biological Sciences, Centre for Molecular Microbiology and Infection Imperial College of Science, Technology, and Medicine, Kensington, London SW7 2AY, United Kingdom2
Received 21 June 2002/ Accepted 14 October 2002
More than 60% of all Salmonella strains identified and 99% of the serovars responsible for disease in warm-blooded animals are members of subspecies I. The other Salmonella subspecies, in particular subspecies IIIa (Arizona) and S. bongori, are associated with disease in cold-blooded organisms, with Arizona occasionally responsible for systemic disease in humans. What is particularly intriguing about subspecies I serovars is that their ability to cause disease in animals encompasses a spectrum of host specificity and disease severity. For example, serovar Typhi causes a systemic disease (typhoid) only in humans and higher primates, whereas serovar Enteritidis produces a self-limiting gastrointestinal disease in many different animals. Serovar Typhimurium causes a gastrointestinal disease in a wide variety of animals and yet is also responsible for a typhoid-like disease in the mouse. In addition, specific isolates have been found in cases of severe disease in pigeons (42). One of the prevailing questions in Salmonella research today concerns the identification of genetic factors that confer upon these highly related serovars their ability to colonize, and in some cases to cause disease in, a wide variety of animal hosts.
The release of two S. enterica sequences, the imminent completion of six other serovars and strains, and the recent funding to sequence additional serovars and strains (http://www.sanger.ac.uk/Projects/Salmonella/) have initiated a new era of comparative genomics in Salmonella biology (14). This sequence information will provide a valuable resource from which we can begin to dissect the features of Salmonella that are both shared and distinct between serovars and to start exploring how and why differences arose. However, sequencing is still a laborious and expensive technique, making it difficult to obtain answers concerning the genetic composition of serovars, strains, or newly emerged variants of interest in a timely manner. DNA microarray technology provides a useful adjunct to current techniques for the assessment of differences and changes in bacterial genetic content. Indeed, this approach has already been utilized in a variety of bacteria to probe for differences between clinical isolates, vaccine strains, species diversity, and disease endemicity (reviewed in references 22 and 27).
We used a Typhimurium spotted DNA microarray to compare the genomes of a number of Salmonella serovars in order to clarify their genetic relationship and identify features that may serve to profile the serovars and that may correspond to host range and disease. Twenty-four strains of 12 S. enterica serovars and two S. bongori strains were analyzed. We found general agreement with previously published observations based on multilocus enzyme electrophoresis (MLEE), but we also describe here a more distant relationship of Arizona to subspecies I than was previously determined and identify genetic features that suggest a common origin for the human systemic disease-associated serovars Typhi, Paratyphi A, and Sendai.
TABLE 1. Strain information and the percentage of SL1344 genes present for each strain
Hybridizations. Genomic DNA probe preparation was performed essentially as previously described (46), except that 1.5 µg of genomic DNA and one-eighth of one reaction vial of FluoroLink Cy3 or Cy5 monofunctional dye (Amersham) was used per reaction. SL1344 genomic DNA was the reference DNA for all hybridizations and labeled with Cy3, whereas sample DNAs were labeled with Cy5. Multiple hybridizations were performed for the majority of strains analyzed to reduce the effects of variation in array quality. The separate labeling reactions were pooled after each respective Cy dye incorporation step and then again divided into aliquots to minimize inconsistencies in probe generation. The probes were resuspended in 18 µl of Tris-EDTA (TE), 2 µl of 20 mg of yeast tRNA/ml, 4.25 µl of 20x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate; 3.4x, final concentration), and 0.75 µl of 10% sodium dodecyl sulfate (0.3%) and then denatured for 2 min at 99°C and centrifuged briefly at 13,800 x g. Probes were hybridized to the array at 50°C for 16 h and washed as described previously (15). Arrays were scanned with a GenePix 4000A scanner (Axon Instruments, Redwood City, Calif.) and processed by using GenePix Pro 3.0. All raw datasets are available from the Stanford Microarray Database (http://genome-www5.stanford.edu/MicroArray/SMD/) (53).
Data analysis. Normalized data from all 64 hybridizations were filtered for spot quality (Cy3 net mean intensity of ≥350) and downloaded from the Stanford Microarray Database according to their mean log2 Cy5/Cy3 (logRAT2N) ratios for analysis. In addition, spots that gave invalid results for more than 20% of the strains were removed. This allowed for the retrieval of 4,494 spots. The data set was additionally filtered to remove spots whose serovar Typhi CT18 hybridization results were inconsistent with what was expected (see below) and ultimately yielded 4,122 spots corresponding to 3,353 annotated ORFs (73% of the LT2 genome) and 20 intergenic regions. The word "gene" will be used throughout in reference to the ORF that each spot corresponds to unless otherwise specified. The multiple arrays for each serovar or strain were averaged across the datasets. This and all subsequent data analyses were done by using Microsoft Excel and a microarray genomic analysis program called GACK (27a). Briefly, this program is capable of dynamically generating cutoffs for present/absent (conserved/divergent) gene analysis for each array hybridization and functions independently of any normalization process that would otherwise be strongly influenced by differences between the reference strain and serovar of interest. A user manual for the GACK program has been provided (http://falkow.stanford.edu/whatwedo/software).
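The spot-level filter and replicate averaging described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline the authors ran (they used the Stanford Microarray Database, Excel, and GACK); the function names, the dictionary layout, and the use of `None` for an invalid measurement are all assumptions made for the example.

```python
from typing import Dict, List, Optional

Ratios = Dict[str, List[Optional[float]]]


def filter_spots(log_ratios: Ratios, max_invalid_fraction: float = 0.20) -> Ratios:
    """Keep spots whose fraction of invalid (None) values is at most the limit.

    log_ratios maps a spot name to one log2(Cy5/Cy3) value per strain.
    """
    kept: Ratios = {}
    for spot, values in log_ratios.items():
        if not values:
            continue  # nothing measured for this spot
        invalid = sum(1 for v in values if v is None)
        if invalid / len(values) <= max_invalid_fraction:
            kept[spot] = values
    return kept


def average_replicates(values: List[Optional[float]]) -> Optional[float]:
    """Average replicate arrays for one strain, ignoring invalid values."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical data: five strains per spot.
example = {
    "spot_a": [0.1, -0.2, 0.0, 0.3, None],   # 1 of 5 invalid (20%): kept
    "spot_b": [None, None, -2.1, 0.2, 0.1],  # 2 of 5 invalid (40%): removed
}
kept = filter_spots(example)
```

A spot is dropped only when more than 20% of its values are invalid, so a spot at exactly 20% is retained in this sketch.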
The hybridization data for the three CT18 arrays were downloaded, filtered by using the same parameters indicated above except that a 66% good data filter was applied to ensure at least two data points for each spot, averaged across the three arrays, and analyzed by using GACK. The percent similarity for the highest high-scoring profile of each amplicon in CT18 was obtained by using WU-BLAST. This percent similarity was compared to the averaged logRAT2N of the hybridization data. Of the 4,712 spots that were retrieved, 85.4% of the data set gave unequivocal results. A total of 687 (14.6% of the data set) had hybridization results contrary to what was expected from the percent similarity analysis. A total of 250 of these were false positives, whereas 160 were false negatives. This list of 410 spots (8.7% of the data set) was used as an additional filter for the data (see above). Another 277 spots either had percent similarities in the uncertain/slightly divergent range or yielded hybridization results such that they could not be assigned as present or absent and so were not included in the filter.
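The counts reported for the CT18 validation can be checked with a few lines of arithmetic. The sketch below also includes a small labelling helper; its definitions (a false positive is a spot called present on the array although the sequence comparison predicts absent, and the reverse for a false negative) are an assumption about how the terms are used here, and the function and constant names are hypothetical.

```python
def concordance(expected: str, observed: str) -> str:
    """Label one spot by comparing the sequence-based call with the array call.

    Both arguments are 'present', 'absent', or 'uncertain' (assumed labels).
    """
    if expected == "absent" and observed == "present":
        return "false_positive"
    if expected == "present" and observed == "absent":
        return "false_negative"
    if expected == "uncertain" or observed == "uncertain":
        return "unassigned"
    return "concordant"


# Counts taken from the text above.
TOTAL_SPOTS = 4712
FALSE_POSITIVES = 250
FALSE_NEGATIVES = 160
UNASSIGNED = 277


def percent_of_total(n: int) -> float:
    """Percentage of the CT18 data set, rounded to one decimal place."""
    return round(100.0 * n / TOTAL_SPOTS, 1)


filter_list = FALSE_POSITIVES + FALSE_NEGATIVES  # spots used as the extra filter
discordant = filter_list + UNASSIGNED            # all spots without a clear match

print(filter_list, percent_of_total(filter_list))
print(discordant, percent_of_total(discordant))
print(percent_of_total(TOTAL_SPOTS - discordant))
```

The sums reproduce the reported figures: 250 + 160 = 410 spots (8.7%), 410 + 277 = 687 spots (14.6%), and the remaining 85.4% are unequivocal.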
Genome order analysis was performed by organizing the spots for the entire data set (averaged and processed with GACK) in their genome order according to the LT2 annotation and viewed with TREEVIEW (16). Clustering was performed by using the Pearson correlation, noncentered metric algorithm of XCLUSTER (Gavin Sherlock; http://genome-www.stanford.edu/~sherlock/cluster.html). Clustered datasets were viewed in TREEVIEW.
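For readers unfamiliar with the noncentered (uncentered) Pearson correlation, it differs from the ordinary Pearson coefficient in that the means are not subtracted, so it equals the cosine of the angle between the two profiles: r = Σxᵢyᵢ / sqrt(Σxᵢ² · Σyᵢ²). The sketch below illustrates the metric only; it is not XCLUSTER's code, and skipping positions where either value is missing is an assumption made for this example.

```python
import math
from typing import Optional, Sequence


def uncentered_pearson(x: Sequence[Optional[float]],
                       y: Sequence[Optional[float]]) -> float:
    """Uncentered Pearson correlation between two log-ratio profiles.

    Positions where either profile is None are skipped (an assumption).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no positions with data in both profiles")
    sxy = sum(a * b for a, b in pairs)
    sxx = sum(a * a for a, _ in pairs)
    syy = sum(b * b for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for an all-zero profile")
    return sxy / math.sqrt(sxx * syy)


# Proportional profiles correlate perfectly.
r_scaled = uncentered_pearson([1, 2, 3], [2, 4, 6])
# A constant offset lowers the uncentered value, although the ordinary
# (centered) Pearson coefficient for these two profiles would be 1.
r_shifted = uncentered_pearson([1, 2, 3], [2, 3, 4])
```

Because the means are not removed, two strains whose log ratios differ by a constant shift are not treated as identical, which matters when the sign of a log ratio carries the present/absent information.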
Core genes analysis was performed by applying the GACK program to the data set and analyzing the output in Excel to identify spots that were present across all of the serovars and strains. The percent present analysis, which estimates the percentage of SL1344 genes shared by each serovar, was performed by taking the data set above and determining the number of absent, missing, and present spots for each serovar.
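A rough sketch of the core genes and percent present calculations follows. The text does not give the exact formula, so the choice to compute percent present as present / (present + absent), leaving missing spots out of the denominator, is an assumption, as are the function names and the nested-dictionary layout.

```python
from typing import Dict, List, Optional

# calls[spot][strain] is 'present', 'absent', or None for missing data.
Calls = Dict[str, Dict[str, Optional[str]]]


def core_spots(calls: Calls) -> List[str]:
    """Spots called present in every strain (missing data disqualifies a spot)."""
    return [spot for spot, by_strain in calls.items()
            if by_strain and all(c == "present" for c in by_strain.values())]


def percent_present(calls: Calls, strain: str) -> float:
    """Percentage of scored spots called present in one strain.

    Assumption: missing spots are excluded from the denominator.
    """
    present = absent = 0
    for by_strain in calls.values():
        call = by_strain.get(strain)
        if call == "present":
            present += 1
        elif call == "absent":
            absent += 1
    scored = present + absent
    if scored == 0:
        raise ValueError(f"no scored spots for strain {strain!r}")
    return 100.0 * present / scored


# Hypothetical calls for two strains, A and B.
calls = {
    "s1": {"A": "present", "B": "present"},
    "s2": {"A": "present", "B": "absent"},
    "s3": {"A": "present", "B": None},
    "s4": {"A": "absent", "B": "absent"},
}
core = core_spots(calls)
pct_a = percent_present(calls, "A")
pct_b = percent_present(calls, "B")
```

In this toy data set only `s1` is a core spot; strain A scores 75% present and strain B about 33%, because its one missing spot is not counted.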
Final datasets have been made available (http://falkow.stanford.edu/whatwedo/supplementarydata/) in a format compatible for viewing with TREEVIEW as noted in the text.
Genes sharing ≥93% identity with the amplicon (87% of the data set) would be detected as present/conserved on our array, whereas genes with 77% identity or less would be assigned as absent/divergent (11% of the data set). Genes that share 92 to 78% identity with the amplicon were classified as uncertain/slightly divergent (2% of the data set). In addition, a percent present analysis was performed on CT18 by using the complete serovar data set (see Materials and Methods and below) and revealed that 91.5% of SL1344 is shared with CT18 (Table 1), a level comparable to the 88% observed in DNA hybridizations involving serovar Typhi strains 643 and LT2 (13), the 89% seen upon direct comparison of the CT18 and LT2 genome sequences (35), and the 90% seen with CT18 hybridized on an LT2 microarray (40).
Genome order analysis reveals discrete regions of variability between serovars and strains. Salmonella possesses a large amount of horizontally acquired genetic information in the form of prophages and pathogenicity islands (distinguished by their absence in Escherichia coli), within many of which lie genes that have been shown to play a role in pathogenesis (19, 21, 23, 25, 34, 35, 37, 39). It has also been observed that a large proportion of the Typhimurium genome consists of regions of genes with related functions, contributing to the proposal that many horizontally acquired genes are more stably maintained in bacteria when genes of related function are transferred along with them (28). We were therefore interested in seeing whether or not discrete patterns of genes could be identified by arranging our microarray data with respect to the LT2 annotated gene order. This organization of data highlights distinct regions where multiple contiguous genes share the same hybridization pattern regardless of possible translocation or inversion of the region.
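The idea of reading the calls in genome order to find runs of contiguous genes with the same hybridization pattern can be illustrated as below. This is only a sketch for a single strain: the authors inspected the ordered data visually in TREEVIEW, and the function name, the minimum run length, and the gene labels are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def contiguous_regions(ordered_calls: List[Tuple[str, str]],
                       target: str = "absent",
                       min_length: int = 2) -> List[Tuple[str, str, int]]:
    """Find runs of consecutive genes that share the target call.

    ordered_calls is a list of (gene, call) pairs already sorted in
    reference genome order. Returns (first_gene, last_gene, run_length).
    """
    regions = []
    for call, group in groupby(ordered_calls, key=lambda item: item[1]):
        genes = [gene for gene, _ in group]
        if call == target and len(genes) >= min_length:
            regions.append((genes[0], genes[-1], len(genes)))
    return regions


# Hypothetical calls for one strain, listed in reference gene order.
ordered = [
    ("gene1", "present"),
    ("gene2", "absent"),
    ("gene3", "absent"),
    ("gene4", "absent"),
    ("gene5", "present"),
    ("gene6", "absent"),
    ("gene7", "present"),
]
regions = contiguous_regions(ordered)
```

Here the three-gene run from `gene2` to `gene4` is reported, while the isolated absent call at `gene6` falls below the minimum run length and is ignored.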
An overview of the Salmonella serovar DNA hybridizations to the SL1344 DNA microarray presented in LT2 gene order is provided in Fig. 1A. It is immediately apparent that the pSLT Typhimurium virulence plasmid, which is required for systemic disease in the mouse, is predominantly missing from all non-Typhimurium serovars except for serovars Dublin and Paratyphi C (Fig. 1B). It has been demonstrated by Boyd and Hartl (8) by Southern hybridization that the spv gene cluster in pSLT associated with systemic disease is distributed throughout subspecies I and, when present, is always found in a virulence plasmid. In addition, these authors demonstrated that the spv cluster is also found in subspecies II, IIIa, IV, and VIII, but in these instances it is always chromosomally located. The observation that the plasmid and spv region are present in serovars Dublin and Paratyphi C and absent in S. bongori corresponds with what has been previously observed, and comparison of the spv amplicon sequences to the Paratyphi A and Typhi genome sequences confirms that the cluster is absent in these serovars as well (data not shown). However, whereas this previous work examining the same serovar Enteritidis, Pullorum, and Choleraesuis SARB strains as in our study indicates that all possess the spv cluster, our data show that there is some heterogeneity within the cluster, as well as throughout the plasmid, for these serovars. One explanation for the differences between the two data sets is that, in the Southern hybridizations, a probe against the entire spv cluster was used, whereas our work reflects hybridizations to short, gene-specific amplicons. Interestingly, all serovars appear to be missing or have divergent portions of the virulence plasmid outside of the spv cluster.
The array hybridization pattern for Arizona indicates that it possesses the spv locus but is missing the majority of the pSLT plasmid, a finding consistent with previous observations regarding the chromosomal location of spv in this serovar (8). However, our data indicate that there is some variability in the cluster. This is supported by a finding that, although spvRBC of an Arizona clinical isolate are conserved with respect to subspecies I, spvD was absent and spvA possessed a frameshift mutation at the C terminus of the gene that results in a larger protein (8, 33).
FIG. 1. Genome order analysis of the serovar microarray data. Multiple arrays for each serovar and strain have been averaged, analyzed with GACK, and organized with respect to the LT2 genome order. Each row corresponds to a specific spot on the array, whereas columns represent strains analyzed and are labeled according to the designations in Table 1. The color scheme is located at the bottom of the figure, with the brightest yellow corresponding to spots that are absent/divergent with high certainty, the brightest blue indicating spots that are present/conserved with the greatest certainty, black indicating spots that are uncertain or slightly divergent, and gray indicating missing data. (A) The entire data set of 4,122 spots. Indicated are the pSLT virulence plasmid, the SPI-1 and SPI-2 pathogenicity islands, the Stf and Lpf fimbrial operons, and the Fels and Gifsy prophages. Enlargements of the regions corresponding to the pSLT virulence plasmid (B), the SPI-2 pathogenicity island (C), and the SPI-1 pathogenicity island (D) are also shown. Specified are the annotated genes within each region, where vertical bars indicate multiple spots on the array that correspond to the same gene. Not indicated are putative genes, unannotated ORFs, and intergenic regions. This data set is available online (http://falkow.stanford.edu/whatwedo/supplementarydata/, Appendix 1).
In addition to these previously characterized regions, there were multiple variable regions throughout the genome that are composed of uncharacterized putative proteins. These regions are potentially of great interest since they represent clusters of genes that may have a serovar-specific association. For example, the region including genes STM4488 to STM4497 houses a putative type II restriction enzyme that is only present in serovar Typhimurium. The region from STM4418 to STM4436 includes sugar transporters, putative endonucleases, and putative cytoplasmic proteins that are present only in serovars Typhimurium and Paratyphi B. STM4258 to STM4264 represent a region of putative genes absent only in Arizona.
Another interesting general observation that can be made from Fig. 1A is the large number of genes missing in the S. bongori and Arizona serovar hybridizations, which correlates with the current organization of these two serovars into different subspecies. A major difference between these two serovars (subspecies V and IIIa) and the other serovars (subspecies I) is emphasized in the SPI-2 region (Fig. 1C). We observed here that S. bongori does not hybridize to the majority of the SPI-2 spots, whereas Arizona has a heterogeneous pattern of hybridization, indicating an appreciable degree of sequence variability. This could indicate that the SPI-2 island in Arizona has undergone some modification, possibly consistent with its preferred niche in cold-blooded animals. Previous work looking at the distribution of these pathogenicity islands by using the SARC set demonstrated the absence of SPI-2 in S. bongori but its presence across the other subspecies, including Arizona (36). Since the probe used in the Southern hybridizations spanned the entire SPI-2 region, its sensitivity for detecting changes in individual genes is lower than that of our gene-by-gene probing. In contrast, the microarray hybridization pattern for SPI-1 (Fig. 1D) indicates that it is present across the serovars, a finding consistent with what has been previously shown (36). One SPI-1 gene, avrA, is absent only in the human-specific serovars and S. bongori (Fig. 1D). This gene has been previously shown to encode an effector molecule secreted by the SPI-1 system that bears protein sequence similarity to the Yersinia pseudotuberculosis YopJ protein and the plant pathogen Xanthomonas campestris pv. vesicatoria-secreted avirulence protein AvrRxv, which is thought to play a role in determining plant host range (26) (see below).
With the exception of Sendai (which is associated with enteric fever in humans), the hybridization results show the same distribution for avrA as predicted from Southern hybridization studies (41). Three other SPI-1 genes (sipA, sptP, and sipB) gave microarray hybridization patterns that corresponded with those expected in the SARB set (41).
FIG. 2. Hierarchical cluster analysis of the microarray data by XCLUSTER. (A) Clustering of the entire data set by both serovar and gene. Shown at the top is the unrooted tree for the relationship of the serovars. (B) Clustering of the data set is as described in panel A except that the 2,244 core genes have been removed. (C) Enlargement of the tree generated in panel B. Color scheme is as described in Fig. 1. The data set in panel B is available elsewhere (http://falkow.stanford.edu/whatwedo/supplementarydata/, Appendix 5).
Measuring serovar relatedness with respect to SL1344 reveals that Arizona is more distantly related than currently thought. Previous DNA reassociation experiments indicated that subspecies I serovars shared between 85 and 100% of their genetic information with the reference LT2 strain and that Arizona (subspecies IIIa) shared on the order of 70 to 80%, indicating that it is highly related to all of the other salmonellae but is genetically distinct. A third class of "atypical" Salmonella (from subspecies II and IV) had DNA-DNA association ranges that fell between the two classes (13). We were therefore interested in assessing whether or not the microarray data would give us a measure of the relatedness of Salmonella serovars to each other.
Based on an analysis of the microarray data, we estimated the percentage of SL1344 genes contained within each serovar (Table 1). A total of 89 to 100% of SL1344 was shared with the subspecies I serovars and >99% with all other serovar Typhimurium strains. The percentages for serovars Typhi (90.4 to 91.5%), Paratyphi A (89.2 to 89.8%), and Paratyphi B (92.2 to 93.8%) are comparable to genomic comparison values reported for an LT2 microarray and the comparison of annotated sequences (35, 40). A total of 73.5 to 77.5% of the SL1344 genome hybridized with Arizona, with the clinical isolate 5705A sharing the smallest number of genes. These values are consistent with the percent similarities reported in the DNA reassociation experiments (13). Interestingly, SL1344 shares 83% of its genetic information with both S. bongori strains analyzed, placing it in an intermediate range between Arizona and subspecies I. This result indicates that Arizona is the most divergent of the serovars analyzed in the present study, whereas S. bongori assumes a more intermediate degree of difference. A slightly lower percent present value has been previously reported for Arizona relative to S. bongori (83 and 85%, respectively) (35, 40) but not to the same extent as we observe here. MLEE analysis has implicated S. bongori as the most divergent of the Salmonella (10, 44), and the lack of an SPI-2 pathogenicity island indicates that it may have diverged before S. enterica acquired SPI-2 (3). However, the possibility exists that S. bongori originally possessed the island but eliminated it earlier in its evolution due to its restricted niche in cold-blooded animals. Arizona, on the other hand, although considered a pathogen of reptiles, can cause severe disease in humans (24, 57). The island in Arizona may therefore have evolved to maximize its contribution to Arizona's host range. 
Arizona's genome as a whole may have proceeded down a distinct evolutionary path, leading to its ability to cause disease in such disparate hosts as humans and cold-blooded animals.
Serogroup and disease-associated relationships are revealed by hierarchical clustering. We employed a clustering analysis tool to determine how the assayed serovars are associated with one another and to identify patterns in gene composition that drive the associations. The unrooted tree generated from clustering the entire data set by both serovars and genes (Fig. 2A) reveals that Arizona is the most distant serovar relative to subspecies I, followed by the two S. bongori strains, which is consistent with the percent present analysis in Table 1. All of the subspecies I serovars fall under the same major node, whereas all of the serovar Typhi strains and Typhimurium strains form their own subnodes (Fig. 2A). The large blue region in the center of the image illustrates the appreciable proportion of core genes that are shared across the serovars.
Clustering of a data set from which the core genes (54% of the entire data set) were removed produced a different unrooted tree (Fig. 2B and C) without altering the composition of the gene clusters (data not shown). The differences between the serovars were emphasized, and a greater level of resolution with regard to the relationships between the strains and serovars, particularly within the subspecies I serovars, was achieved. For example, Tm2 appears to be genetically distinct from the other three strains of serovar Typhimurium. In addition, serovar Gallinarum associated with other group D serovars (Pullorum, Enteritidis, and Dublin), and Arizona and S. bongori were set apart from the subspecies I serovars by forming a distinct node (Fig. 2C). Analysis of the data by PHYLIP Camin-Sokal parsimony analysis (http://evolution.genetics.washington.edu) generated an unrooted tree that revealed a similar relationship between the serovars and strains (data not shown).
Molecular signatures characterize groups of serovars. Previous MLEE work focusing on the human enteric fever-associated serovars could not resolve a relationship between them (9, 50). However, we observe the clustering of serovars Typhi, Paratyphi A, and Sendai into a shared node, suggesting shared genetic features may exist for these serovars (Fig. 2C). A cluster of genes absent from S. bongori, Arizona, and the human enteric fever-associated serovars but present in the other warm-blooded disease-associated serovars is shown in Fig. 3. This pattern of absent/divergent genes suggests that a common Salmonella ancestor possessed these genes but that they were lost or diverged specifically in the enteric fever-associated serovars and in Arizona and S. bongori. Alternatively, perhaps the common Salmonella ancestor lacked these genes and the nonenteric fever-associated serovars acquired them, whereas the human enteric fever-associated serovars did not.
FIG. 3. A cluster pattern showing genes that are absent in the serovars associated with human enteric fever and cold-blooded animals but present in the other warm-blooded disease-associated serovars. This cluster was pulled out from the larger image shown in Fig. 2B. Indicated is the annotation gene number (STM) and annotated gene information. The color scheme is as described in Fig. 1. Intergenic regions are not shown. Putative genes are indicated by the term "put."
Figure 4 shows a node of genes specific to group B serovars, which include Typhimurium and Paratyphi B. Contained within this group B signature is the gene for OafA, which is responsible for the O-antigen acetylation step that generates the group B serotype antigen. Interestingly, even though these serovars share a strong group B pattern, Paratyphi B (which causes a typhoid-like disease in humans) forms a node in the tree distinct from Typhimurium (Fig. 2C). In contrast, MLEE analysis indicated that serovars Typhimurium and Paratyphi B are relatively similar (9, 50). There may be additional disease-associated genetic features that drive the separation of Typhimurium and Paratyphi B (Fig. 4). Nevertheless, the existence of a specific signature for this serogroup indicates that shared genetic factors do not necessarily correlate with disease phenotype.
FIG. 4. (A) A cluster indicating genes that are conserved only in group B serovars. This cluster was pulled out from the larger image shown in Fig. 2B. Indicated is the annotation information from the LT2 sequence including the gene number (STM number) and the gene name if available. Not indicated are intergenic regions. The color scheme is as described in Fig. 1.
The presence of a cluster of absent genes unique to serovars associated with human-specific enteric fever (Fig. 3) indicates that there may have been a convergent evolution of the serovars into their disease niche or that they originated from a common ancestral serovar. This cluster includes genes from the Lpf fimbrial operon previously shown to be absent in serovars Typhi, Paratyphi A, and Sendai (Appendix 3) (56), the sodC-1 gene encoding a periplasmic superoxide dismutase (located within the Gifsy-2 prophage region), and the avrA gene mentioned above. Lpf has been implicated in mediating attachment of serovar Typhimurium to the Peyer's patches in the mouse intestine, a critical early step in the disease process (4, 5). SodC-1 is one of three periplasmic superoxide dismutases found in Salmonella and one of two horizontally acquired from lysogenic bacteriophages (21). The periplasmic location of the Sod enzymes positions them for the neutralization of the toxic effects of exogenous superoxide produced by the host, a feature that is critical for persistence in macrophages and pathogenesis in mice (11, 18, 49). Mutation of either sodC-1 or sodC-2 in serovar Typhimurium has been shown to attenuate the bacteria during mouse infection while mutation of both genes leads to an even more severely attenuated phenotype (17). Interestingly, although mutation of either sodC-1 or sodC-2 in serovar Choleraesuis leads to attenuation in mice, mutation of both does not result in a more severe phenotype (49). While all Salmonella serovars possess sodC-2, the distribution of sodC-1 has been shown to be restricted to a smaller set of serovars, including isolates of the serovars Dublin, Enteritidis, Gallinarum, Pullorum, and Paratyphi B (17), a finding in agreement with our observations and its location in a prophage. 
Taking into account the important role that this enzyme plays in resistance to the host superoxide response, it has been proposed that there may be selective pressure for the horizontal acquisition of additional Sod genes (49). AvrA, as mentioned above, possesses homology to the Y. pseudotuberculosis-secreted effector YopJ and the plant pathogen X. campestris-secreted protein AvrRxv. YopJ has been shown to interfere with the host immune response by modulating the mitogen-activated protein kinase and NF-κB signaling pathways to prevent the release of proinflammatory cytokines and block the antiapoptotic pathway (38). AvrRxv is a member of a family of plant pathogen-secreted "avirulence" proteins whose presence elicits a hypersensitivity response from the plant that limits the spread of disease and thereby restricts host range (29). Recent work has demonstrated a potential role for serovar Typhimurium AvrA in modulating virulence in vertebrates by interfering with proinflammatory NF-κB activation (12).
Concluding remarks. There are several features of microarray-based analysis that make it a particularly attractive complement to the MLEE method for the study of strain and serovar relationships. The MLEE method is limited by the number of enzymes that can be assayed, whereas the microarray allows us to probe most of the genome of a strain of interest. In addition, MLEE has been determined to have a limited ability to accurately assess some relationships due to the presence of different alleles giving the same pattern of mobility (52). Also, by being restricted to alleles of metabolic enzymes, the contribution of horizontally acquired genes and other larger changes to the evolution of a particular genome are overlooked.
There are also several limitations with spotted DNA microarrays. Single-nucleotide polymorphisms are not currently detectable with spotted DNA arrays; for instance, the limit of resolution for conserved genes in the microarray used for these studies (based on a CT18 analysis) is ≥93% identity. Consequently, caution needs to be employed when applying the term "divergent" to microarray comparison studies since this cutoff is higher than what is currently used in direct sequence comparisons. As a result of this limitation, small genetic changes during serovar evolution and minor gene differences between serovars may not be detected. These restrictions could be addressed by using other independent approaches to verify observations. For example, phylogenic observations made with the microarray could be substantiated and complemented by MLEE or sequence comparison. Microarrays are also incapable of identifying regions present in the serovar of interest but absent from the strain or serovar from which the array was constructed. For example, a comprehensive analysis of clinical isolates of serovar Typhi and their correlation with disease outcome may not be feasible with a serovar Typhimurium array since Typhi possesses genetic elements important in disease (such as multidrug resistance genes and the Vi capsule) that are absent in Typhimurium. Finally, genomic comparisons with microarrays reveal nothing about the effect of gene expression on a serovar or strain's ability to cause disease or persist in a particular environment. One potentially powerful study to address this would be to synchronize the growth of strains of interest and perform RNA expression comparisons at defined time points or under specific growth conditions. Nevertheless, microarrays provide a powerful, high-capacity means to characterize serovars and strains and clearly complement other techniques.
While this study was in preparation, Porwollik et al. described genomic comparisons of Salmonella serovars performed by using a serovar Typhimurium LT2 array (40). We have found that the results of that study and ours are both highly comparable and complementary. We had chosen to incorporate serovars into our analysis that were primarily representative of those in subspecies I associated with disease in humans and animals and had included Arizona and S. bongori as references for their occasional disease association and phylogenic distance, respectively. Porwollik et al. selected serovars that represent all of the subspecies of both S. enterica and S. bongori and incorporated genome information from other genera in order to trace the history of gene acquisition and loss that may have contributed to the emergence of Salmonella and its widely diverse serovars. One major difference in observation between our analyses is the location of Arizona relative to both S. bongori and subspecies I. This difference can most likely be attributed to the respective designs of the arrays. Whereas Porwollik et al. constructed their array to include entire annotated ORFs and used LT2 as the source of genomic DNA, our group built an SL1344 genomic DNA array to include shorter amplicons corresponding to the region of each ORF that was the most unique and therefore less likely to cross-hybridize. Since the two arrays are inherently different in their design, some dissimilarity would be expected in the results.
We have shown here that a Typhimurium microarray is a useful tool for the genomic comparison of Salmonella serovars. With a few exceptions, the observations made here correspond well with those made in MLEE analysis and DNA association experiments, lending credibility to the use of this tool in inferring phylogenic relatedness. The observation that Arizona is more distant from the subspecies I serovars than currently thought is where our results have significantly deviated from previous studies. In general, although we have observed some correlation between the descriptive taxonomy of the serovars and their genetic relatedness, there are additional genetic loci that influence the clustering of some of the serovars together in a manner that reflects their disease associations. An example of this is the clustering of human enteric fever-associated serovars Typhi, Paratyphi A, and Sendai that we observed despite their distribution in two distinct serogroups. The identification of genes that drive the clustering of these serovars may provide information regarding the role of these genes in determining host range and disease phenotype. Thus, while serogrouping may provide a good first approximation of the genetic relationship between the serovars, it is not necessarily predictive of the phylogenic organization of Salmonella (6), nor does it account for the horizontal exchange of genetic information that could alter the serotype of a Salmonella or influence its evolution into a particular disease or host affiliation. Application of microarray-based genomic comparison could therefore serve as a tool to clarify, if not modify, the way in which we understand the organization of Salmonella.
This work was supported by the Ellison Medical Foundation, grant AI26195 from the National Institutes of Health, and Digestive Disease Center grant DK56339. K.C. is a National Science Foundation Graduate Research Fellow. C.C.K. is supported by a Howard Hughes Medical Institute Predoctoral Fellowship and a Stanford Graduate Fellowship. S.B. and G.D. are supported by The Wellcome Trust and the Biotechnology and Biological Sciences Research Council. C.S.D. is supported by a fellowship from the American Cancer Society.