Journal of Bacteriology, April 2006, p. 2309-2324, Vol. 188, No. 7
0021-9193/06/$08.00+0 doi:10.1128/JB.188.7.2309-2324.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Coxiella Pathogenesis Section, Laboratory of Intracellular Parasites,¹ Laboratory of Persistent Viral Diseases,³ Genomics Core Facility, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases, Hamilton, Montana 59840,⁴ Department of Medical Microbiology and Immunology, Texas A&M University System, Health Science Center, College Station, Texas 77843²
Received 19 September 2005/ Accepted 15 December 2005
Isolates of C. burnetii derived from a variety of geographical areas and various hosts display considerable genetic homogeneity when examined solely by 16S rRNA gene sequencing (65). However, restriction fragment length polymorphism (RFLP) analysis of genomic DNA (gDNA) reveals considerable heterogeneity in banding patterns (31, 33, 36, 40, 77), indicating genetic diversity between C. burnetii isolates. Using this method, Hendrix et al. (33) categorized 32 C. burnetii isolates into six distinct genomic groups (I to VI). Additional genomic variance was subsequently described by Jager et al. (36) in their RFLP analysis of 80 isolates. Differentiation of C. burnetii has also been achieved by sequence and/or PCR-RFLP analysis of icd (encoding isocitrate dehydrogenase) (5, 46), com1 (encoding an outer membrane-associated immunoreactive protein) (34, 62, 83), and mucZ (known to confer a mucoid property to bacteria) (62). The most recent and extensive survey of C. burnetii genetic diversity was reported by Glazunova et al. (24), who used multispacer sequence typing to place 173 C. burnetii isolates into 30 different genotypic groups.
Most C. burnetii isolates harbor 1 of 4 autonomously replicating plasmids termed QpH1, QpRS, QpDV, and QpDG (40, 75). These plasmids range from 32 to 42 kb in size and share a common 25-kb "core" region along with unique regions (40, 59, 79). QpRS-like plasmid sequences are integrated into the chromosome of some isolates (40, 60, 79). Plasmid types are associated with specific genomic groups. Interestingly, human isolates within genomic groups I, II, and III were all derived from acute disease patients, whereas human isolates within genomic groups IV and V were all derived from chronic disease patients. These correlations led to the controversial hypothesis that plasmids and their associated genome encode pathotype-specific virulence factors (36, 40, 71). Controversy aside, the absolute conservation of chromosomally integrated or autonomously replicating plasmid sequences in all C. burnetii isolates examined to date suggests that these sequences play a critical role in some aspect of C. burnetii biology.
The genetic diversity of C. burnetii is also revealed by production of antigenically and structurally unique lipopolysaccharide (LPS) molecules. Three distinct C. burnetii LPS chemotypes have been described that are associated with specific genomic groups (27, 28, 30, 73), and a potential link between LPS chemotype and C. burnetii virulence potential has been proposed (27). LPS is the only defined virulence factor of C. burnetii (43). Virulent C. burnetii isolated from natural sources and infections all produce a full-length LPS that is serologically defined as "phase I." Serial in vitro passage of phase I C. burnetii in embryonated eggs or tissue culture results in LPS molecules with decreasing molecular weights, culminating in the severely truncated LPS of avirulent phase II organisms. Phase II LPS contains lipid A and some core sugars but is missing O-antigen sugars and appears to represent the minimal LPS structure of C. burnetii (1, 30, 74). Two cloned LPS variants of the virulent Nine Mile phase I (NMI) isolate have been described: Nine Mile Crazy (NMC; intermediate virulence), producing an intermediate-length LPS, and Nine Mile phase II (NMII; avirulent), producing a severely truncated phase II LPS (30, 43). NMII and NMC have large chromosomal deletions (26 and 32 kb, respectively) that eliminate open reading frames (ORFs) involved in the biosynthesis of O-antigen sugars, including the rare sugar virenose (6-deoxy-3-C-methylgulose) (35, 76).
The availability of genome sequences of pathogenic bacteria allows a rapid assessment of intrastrain/species whole genome sequence variation by using comparative genome hybridization (CGH) on DNA microarrays. Comparisons have been made between strains of the same species (18, 50, 56) and between different species of the same genus that differ in virulence potential, environmental origin, and animal host (22, 48). These studies provide insight into pathogen genetic diversity, strain evolution, epidemiology of disease, and virulence potential. Moreover, CGH can be used for high-resolution molecular strain discrimination. Indeed, current serological, PCR, and cell culture-based diagnostic methods fail to discriminate between different C. burnetii isolates (21). The genome-wide variations identified by CGH allow a more definitive analysis of the evolutionary relatedness of bacteria than other phylogenetic methods, such as those that rely on variation in the nucleotide sequence of a limited number of conserved genes (e.g., rRNA and housekeeping genes), RFLP profiles, or diversity in variable-number tandem repeats (37). These methods generally do not resolve the loss of DNA and, in the case of RFLP profiles, a single band in one isolate can correspond to multiple bands of different isolates, showing that loss of a restriction site does not indicate the deletion of a whole DNA fragment and its encoded genes (77).
The C. burnetii NMI (RSA493) genome has recently been sequenced and consists of a circular chromosome of 1,995,275 bp and a plasmid (QpH1) of 37,393 bp that encode 2,095 and 40 ORFs, respectively (63). Analysis of coding ORFs indicates that, although C. burnetii shares similarities in lifestyle and parasitic strategies to other obligate intracellular bacteria, it generally differs considerably with respect to genome size, metabolic and transporter capabilities, the presence of mobile genetic elements, and the extent of genome reduction (23, 55, 57, 63, 84). To further define the genetic diversity of C. burnetii, gDNA of 24 isolates of diverse geographical and environmental origins were hybridized to a custom Affymetrix microarray chip (RMLchip_a) containing probe sets corresponding to all ORFs of the NMI isolate. We present here the first C. burnetii whole-genome phylogeny. Although the genomes of C. burnetii isolates demonstrate considerable relatedness, some isolates are missing a number of ORFs relative to NMI that likely result in different biological properties. Moreover, the results of the present study expand our understanding of the genetics of LPS phase variation and establish a new method for classifying unknown C. burnetii isolates.
TABLE 1. C. burnetii isolates used in this study
Genomic DNA isolation and amplification. Total gDNA was purified directly from purified C. burnetii, infected guinea pig spleen homogenate (BDT isolate), and infected yolk sac (G isolate) by one of two methods. The first method used an UltraClean microbial DNA isolation kit (MoBio Laboratories, Inc.) according to the protocol recommended by the supplier, with an additional heating step (85°C for 10 min) before physical disruption of the bacterial cells. The second method has been previously described by Samuel et al. (58). All DNA was resuspended in distilled H2O and frozen at -20°C. Whole-genome amplification (WGA) was conducted on total DNA yielded from 12.5 µl of infected yolk sac and directly on 1 µl of C. burnetii vacuole suspension without prior DNA purification. Amplifications used a GenomiPhi DNA Amplification Kit (Amersham Biosciences). Specifically, 1 µl of DNA solution or C. burnetii vacuole suspension was added to 9 µl of sample buffer, which was heated to 95°C for 3 min and then cooled to 4°C on ice. A total of 9 µl of reaction buffer and 1 µl of enzyme mix were then added to the cooled DNA sample. The samples were incubated at 30°C for 18 h, followed by inactivation of the Phi29 DNA polymerase by heating at 65°C for 10 min. The samples were cooled on ice and stored at -20°C. Four separate WGA reactions were conducted on DNA from infected yolk sacs, while eight separate reactions were conducted on the C. burnetii vacuole suspension. The respective reactions were pooled, and DNA was purified by using n-butanol. Briefly, 30 µl of sterile distilled H2O and 500 µl of n-butanol were added to each sample, and the mixture was vortexed for 5 s. WGA DNA was pelleted by centrifugation at 12,000 x g for 10 min, air dried, resuspended in 80 µl of distilled H2O, and then quantified spectrophotometrically.
Affymetrix array design. A custom Affymetrix GeneChip (RMLchip_a) was utilized in the present study (39). The RMLchip_a is an antisense oligonucleotide expression array (18-µm feature size) consisting of ca. 249,690 25-mer probe pairs, with each C. burnetii ORF represented by 16 probe pairs that consist of a perfect-match (PM) probe and a mismatch (MM) probe. The MM probe has a single substitution at position 13 relative to the PM probe. To facilitate the analysis of C. burnetii DNA with potential host cell DNA contamination, during the chip design phase all probe set sequences were pruned against human, rat, and mouse genomes. The same pruning process was conducted to prevent cross-hybridization to the 20 additional bacterial genome sequences on the RMLchip_a. The C. burnetii component of the RMLchip_a consists of probe sets specific for 2,024 full-length genomic ORFs (1,988 chromosomal ORFs and 36 plasmid ORFs) of the sequenced Nine Mile isolate (RSA493) (63). Probe sets specific for 79 disrupted ORFs that contain deletions or point mutations, resulting in premature stop codons or frameshifts, were also included. Probe sets specific for one representative of the 21 IS1111A, 5 IS30, and 3 ISAs1 IS element families were also included. The total coding region of the C. burnetii genome is 1,705,446 bp. The probe sets on the RMLchip_a encompass ca. 30% (511,725 bp) of this region.
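The PM/MM probe-pair layout and the stated coverage figure can be illustrated with a minimal Python sketch. The example 25-mer, the function names, and the choice of the complementary base as the position-13 substitution are assumptions made for illustration; the text states only that the MM probe carries a single substitution at position 13.

```python
# Assumption: the mismatch base is the complement of the perfect-match base.
# The text specifies only "a single substitution at position 13".
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm: str, position: int = 13) -> str:
    """Return an MM probe that differs from the PM 25-mer at one 1-based position."""
    if len(pm) != 25:
        raise ValueError("expected a 25-mer probe")
    i = position - 1  # convert to 0-based index
    return pm[:i] + COMPLEMENT[pm[i]] + pm[i + 1:]


def coverage_fraction(probe_bp: int, coding_bp: int) -> float:
    """Fraction of the coding region represented by probe sets."""
    return probe_bp / coding_bp


pm = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical probe sequence
mm = mismatch_probe(pm)
n_differences = sum(a != b for a, b in zip(pm, mm))

# Figures taken from the text: 511,725 bp of probes over 1,705,446 bp of coding sequence.
coverage = coverage_fraction(511_725, 1_705_446)
```

With the figures from the text, `coverage` rounds to 0.30, in line with the "ca. 30%" reported above.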
Microarray sample preparation and hybridization conditions. Microarray hybridizations were conducted with 7 and 30 µg of gDNA extracted from purified C. burnetii and infected guinea pig spleen homogenate (BDT isolate), respectively. We also used 7 and 10 µg of pooled WGA DNA samples from C. burnetii vacuole suspensions and infected yolk sacs, respectively. Total DNA was fragmented with DNase I (Roche Applied Science; 0.001 U/µg of DNA) for 10 min at 37°C in DNase I buffer. DNase I was denatured by heating (10 min at 98°C) and 1 µl of the fragmented gDNA was electrophoretically analyzed with an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.) to confirm the generation of DNA fragments between 50 and 200 bp. Fragmented DNA was then 3' end labeled with biotin-ddUTP for 60 min at 37°C by using a BioArray terminal labeling kit (Enzo Life Sciences, Inc.). The fragmented, end-labeled cDNA was added to hybridization solution that was used to probe the RMLchip_a. Target hybridizations, RMLchip_a washing, staining, and scanning were conducted according to the manufacturer's suggestions at the National Institute of Allergy and Infectious Diseases Affymetrix Core Facility (SAIC-Frederick, Inc.). Prior to hybridization to the RMLchip_a, the gDNA samples in hybridization solution were heated to 95°C for 5 min and then incubated for 5 min on ice, followed by equilibration to room temperature. Hybridization was carried out for 16 h at 50°C with rotation in a hybridization oven (Affymetrix) at 60 rpm. After removal of the hybridization solution, the RMLchip_a was washed and stained using a Fluidics Station 400 according to directions stated by Affymetrix for the GeneChip Pseudomonas aeruginosa Array. Labeled RMLchip_a's were then scanned by using an Affymetrix GeneChip Scanner 3000.
Microarray data analysis. Images from scanned microarrays were processed with Affymetrix GeneChip Operating Software (GCOS) version 1.2 (Affymetrix). One to four hybridizations per isolate were performed with separately labeled gDNA or WGA DNA. The difference in intensity value for each probe pair was calculated by subtracting the MM probe intensity value from the PM probe intensity value. The average difference in intensity values for the entire 16 probe pairs for each ORF was calculated by using the statistical expression algorithm incorporated into GCOS. Normalization between arrays was achieved by implementing array-specific scaling factors that were calculated as a ratio of 500 divided by the average of the average difference in intensity values for all ORFs represented on the RMLchip_a. For each ORF, the average difference in intensity was multiplied by the array-specific scaling factor. Using Partek Pro software version 6.04.1208 (Partek), sample/reference ratios of signal intensity were calculated and transformed to logarithm base 2. The ratios were then normalized by setting the median log2 ratio to zero, and the final value was calculated from the average of all replicates. ORFs that had a log2 ratio of less than -1 were considered absent and subjected to PCR and sequencing validation.
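The normalization and absence-calling steps described above can be sketched in a few lines of Python. This is an illustrative outline, not the GCOS or Partek Pro implementation: the function names, the toy intensities, and the signal floor are hypothetical, and the cutoff is left as a parameter, with -1 (half the reference signal or less) used here as an assumed default.

```python
import math
from statistics import mean, median

TARGET = 500.0  # target mean signal used for the array-specific scaling factor


def scale_array(avg_diff):
    """Scale per-ORF average differences so that the array mean equals TARGET."""
    factor = TARGET / mean(avg_diff.values())
    return {orf: value * factor for orf, value in avg_diff.items()}


def log2_ratios(sample, reference, floor=1.0):
    """Median-centred log2(sample/reference) per ORF.

    The floor guards against non-positive signals; it is an assumption
    of this sketch, not a step described in the text.
    """
    raw = {
        orf: math.log2(max(sample[orf], floor) / max(reference[orf], floor))
        for orf in reference
    }
    centre = median(raw.values())
    return {orf: value - centre for orf, value in raw.items()}


def absent_orfs(replicates, cutoff=-1.0):
    """Average centred log2 ratios across replicates; return ORFs below the cutoff."""
    orfs = replicates[0].keys()
    averaged = {orf: mean(rep[orf] for rep in replicates) for orf in orfs}
    return sorted(orf for orf, value in averaged.items() if value < cutoff)


# Hypothetical signals for four ORFs on a reference array and one sample array.
reference = scale_array({"ORF_A": 400.0, "ORF_B": 600.0, "ORF_C": 500.0, "ORF_D": 500.0})
sample = scale_array({"ORF_A": 200.0, "ORF_B": 300.0, "ORF_C": 250.0, "ORF_D": 10.0})

ratios = log2_ratios(sample, reference)
absent = absent_orfs([ratios])
```

In this toy data set the sample array is uniformly dimmer than the reference, which scaling and median centring remove, leaving only `ORF_D` below the cutoff.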
Microarray validation. PCR validation was conducted for all ORFs indicated as absent by microarray. The PCR strategy used is shown in Fig. S1 in the supplemental material. PCRs utilized primers specific to ORFs flanking putative deletions. They were first used to amplify the two ORFs immediately flanking polymorphic regions to determine whether potential deletions extended into the flanking ORFs. PCR amplifications were then conducted across polymorphic regions to determine whether ORFs within this region were deleted, contained inserted DNA, or were intact. Amplification of NMI genomic DNA was conducted as a control. The ORF-specific primers used in the present study are listed in Table S1 in the supplemental material. Primer sequences were based on the published C. burnetii Nine Mile (RSA493) sequence (63) and were manufactured by Sigma Genosys, LP. PCRs (50 µl) containing 100 ng of gDNA were performed by using either AccuPrime Pfx DNA polymerase (Invitrogen Corp.) for the predicted short PCR products (<4 kb) or Accuprime Taq DNA polymerase High Fidelity (Invitrogen Corp.) for the predicted long PCR products (>4 kb). PCR products analyzed by DNA sequencing were processed by using a QIAquick PCR purification kit (QIAGEN) and sequenced by the RML Genomics Core Facility (Rocky Mountain Laboratories, Hamilton, MT). DNA sequencing reactions were performed with an ABI Prism Dye Terminator Cycle Sequencing Ready Reaction sequencing kit according to the manufacturer's instructions (Applied Biosystems, Inc.), and reactions were analyzed by using a model 3700 automated DNA sequencer (Applied Biosystems, Inc.).
Phylogenetic analysis. A numerical score was assigned to each isolate ORF based on the following criteria: 1 for present, 2 for partially deleted, 3 for absent, and 4 for containing a point mutation or small insertion (<50 bp). ORF designations were then analyzed by using PAUP software version 4.0b10 (PPC; Sinauer Associates, Inc). Phylogenetic trees were constructed by using maximum-parsimony with equal weighting following a heuristic search.
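The character coding above can be illustrated with a short Python sketch that builds an isolate-by-ORF matrix and flags which characters are variable and which are parsimony informative (at least two states, each shared by at least two isolates). The isolate names and scores are hypothetical, and the sketch does not reproduce the heuristic tree search performed in PAUP.

```python
from collections import Counter

# Character states as defined in the text.
STATES = {"present": 1, "partial": 2, "absent": 3, "point_or_small_insertion": 4}


def is_variable(column):
    """A character is variable if more than one state is observed."""
    return len(set(column)) > 1


def is_parsimony_informative(column):
    """At least two states must each occur in at least two isolates."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


# Hypothetical matrix: rows are isolates, columns are ORF characters.
matrix = {
    "iso1": [1, 1, 1, 1],
    "iso2": [1, 3, 1, 3],
    "iso3": [1, 3, 2, 1],
    "iso4": [1, 1, 1, 1],
}

columns = list(zip(*matrix.values()))
variable = [is_variable(c) for c in columns]
informative = [is_parsimony_informative(c) for c in columns]
```

Only characters flagged as informative can favor one tree topology over another under maximum parsimony; a state seen in a single isolate adds one step to every tree.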
LPS extraction and analysis. C. burnetii LPS was extracted by a modified hot phenol method that exploits the molecule's hydrophilic nature (30). Briefly, 1 mg (dry weight) of purified C. burnetii was suspended in 1 ml of 50% phenol. The sample was boiled for 10 min, incubated on ice for 5 min, and then centrifuged at 14,000 x g for 5 min. The aqueous phase was collected, and the extraction was repeated on the pellet. The aqueous phases from both extractions were pooled (ca. 1 ml) and vacuum dried overnight. The resulting pellet was dissolved in 100 µl of distilled H2O, the sample was dried again, and the pellet was dissolved in 50 µl of sample buffer. Samples were then separated on a 12.5% sodium dodecyl sulfate-polyacrylamide gel. LPS was stained by using a SilverQuest Silver staining kit (Invitrogen Corp).
Deposition of microarray data in the public databases. The microarray data have been deposited in the microarray database at European Bioinformatics Institute (http://www.ebi.ac.uk) under the accession number E-TABM-35.
Figure 1 depicts chromosomal (Fig. 1A) and plasmid (Fig. 1B) ORF polymorphisms relative to NMI that were present in individual C. burnetii isolates as identified by CGH. In the present study a polymorphism is defined as a single or multiple linked ORF(s) where negative hybridization occurred due to complete or partial ORF deletion, point mutation(s), or DNA insertion. The number of polymorphic ORFs ranged from 0 in genomic group I isolates to 87 in genomic group V isolates, which represents 4.1% of the NMI coding capacity (Table 2). Between all isolates, a total of 139 ORF polymorphisms (109 chromosomal and 30 plasmid) were identified. A total of 34 of these ORFs have putative functions (Table 3), while 93 encode hypothetical proteins. The remaining 12 polymorphic ORFs are annotated but are predicted to be nonfunctional because of a frameshift or truncation. The number of predicted single-event polymorphisms, i.e., changes predicted to occur once but affecting either single or multiple ORFs, varied between isolates and ranged from 0 in genomic group I isolates to 20 in genomic group V isolates. The size of polymorphisms ranged from a single ORF (e.g., CBU0478 in Du 7E9-12) to 18 ORFs (e.g., plasmid ORFs CBUA0028 to CBUA0005 in genomic group V isolates with integrated "plasmid-like" sequences). Isolates within previously described genomic groups all had the same polymorphisms, confirming their relatedness based on previous RFLP studies (33).
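As a quick consistency check, the tallies in this paragraph can be recomputed in Python. The denominator for the 4.1% figure is an assumption: the sketch uses the 2,135 annotated NMI ORFs (2,095 chromosomal plus 40 plasmid) and, as an alternative, the 2,103 ORFs with probe sets on the array; both give the same rounded value.

```python
# Counts taken from the text.
chromosomal_polymorphic = 109
plasmid_polymorphic = 30
total_polymorphic = chromosomal_polymorphic + plasmid_polymorphic

with_putative_function = 34
hypothetical = 93
predicted_nonfunctional = 12
by_annotation = with_putative_function + hypothetical + predicted_nonfunctional

# Assumed denominators for the group V percentage (see lead-in).
annotated_orfs = 2_095 + 40
arrayed_orfs = 2_024 + 79

group_v_polymorphic = 87
pct_of_annotated = round(100 * group_v_polymorphic / annotated_orfs, 1)
pct_of_arrayed = round(100 * group_v_polymorphic / arrayed_orfs, 1)
```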
FIG. 1. Comparison of the genomes of 22 C. burnetii isolates with the genome of the Nine Mile (RSA493) reference isolate. Labeled DNA was hybridized to the RMLchip_a containing probe sets specific for all ORFs of the Nine Mile isolate. ORF polymorphisms were identified as ORF probe sets showing log2 hybridization ratios lower than -1 that were validated as described in Materials and Methods. Red ORFs were completely deleted, blue ORFs were partially deleted, and yellow ORFs contained point mutation(s) or small insertions. The number and location of each polymorphism, consisting of either single or multiple ORFs, is shown on the right. (A) Chromosomal polymorphisms 1 through 44. (B) Plasmid polymorphisms 45 through 51. Predicted nonfunctional NMI ORFs containing point mutations (PM) or frameshift (FS) mutations are indicated.
TABLE 2. ORF polymorphisms of 22 C. burnetii isolates relative to the Nine Mile reference isolate
TABLE 3. Nine Mile ORFs with a predicted function that are completely or partially deleted in at least one C. burnetii isolate
Seven polymorphisms (45 to 51) relative to the QpH1 plasmid of NMI were observed. One polymorphism (45) was unique to genomic group V isolates that have plasmid-like sequences integrated into the chromosome. A plasmid RD was observed that includes polymorphism 47 of genomic groups IV, V, Le B, and Q321 and polymorphism 46 of Du 7E9-12, which is a subregion of polymorphism 47. A second plasmid RD is represented by the remaining plasmid polymorphisms (48 to 51) that were observed in genomic groups IV, VI, and the ungrouped isolates Le B and Q321. Although the plasmid content of Le B is unknown, it likely carries QpDV since the plasmid polymorphisms of this isolate were identical to Q321, which contains this plasmid (75). Only 10 ORFs were conserved between all plasmid types and integrated plasmid-like sequences (Table 4), with one (CBUA0009) predicted to be nonfunctional in NMI.
TABLE 4. Conserved ORFs of QpH1, QpRS, QpDV, QpDG, and integrated plasmid sequences
TABLE 5. Characterization of genomic polymorphisms
6-kb deletion (CBU1209 to CBU1214), while the deletion comprising polymorphism 31 was ca. 6.7 kb in size (CBU1210 to CBU1216). Polymorphism 30 (CBU1209 to CBU1215) consisted of an approximately 7.3-kb deletion that had a different 5' deletion point from polymorphism 29. Polymorphisms 48 to 51 (CBUA0024 to CBUA0026) constitute a plasmid RD shared between QpRS, QpDV, and QpDG. Microarray validation indicated that CBUA0026 contains an insertion of variable size. Specifically, in QpRS CBUA0026 is disrupted by a 3,079-bp insertion. Remnants of the same insertion were present in CBUA0026 of QpDV and QpDG, consisting of 777 and 2,676 bp of the 3' sequence, respectively. Further polymorphisms in this RD were observed in QpDV and QpDG that consisted of deletion of part of CBUA0026, all of CBUA0025, and part (QpDV) or all (QpDG) of CBUA0024. Collectively, these data suggest these and other RDs are hotspots of genomic sequence variation.
Putative functions of polymorphic ORFs. C. burnetii encodes an unusually high proportion of hypothetical proteins (63). Accordingly, our analysis showed that 93 (67%) of polymorphic ORFs are annotated as encoding hypothetical proteins. An additional 12 (9%) are annotated as nonfunctional. The remaining polymorphic ORFs have a range of predicted cellular functions, including metabolic control, DNA/RNA processing, nutrient transport, stress response, plasmid stability, and putative host function modulation (Table 3).
Five polymorphic ORFs are predicted to encode proteins containing ankyrin or tetratrico peptide repeats, domains typically found in eukaryotic proteins that facilitate protein-protein interactions (25, 44). In eukaryotes, proteins with these domains serve a diverse set of functions involving cytoskeletal interactions, transcriptional regulation, signal transduction, and cell cycle regulation (25, 44). Of the 14 ankyrin repeat proteins (ANKs) encoded by the C. burnetii genome, 4 (CBU0071, CBU0072, CBU1213, and CBU1609) were polymorphic: 3 in genomic group V and Le B and 1 in genomic groups II, IV, and Q321. In genomic group V isolates, one polymorphic ORF (CBU1774) was predicted to encode a tetratrico peptide repeat protein.
Sixteen polymorphic ORFs were common to genomic groups IV and V and the Q321 and Le B isolates. One of these, CBUA0016 (cbhE'), was previously thought to be specific to isolates causing acute disease (42), an idea later discounted upon analysis of a larger group of isolates (65). A second polymorphic ORF (CBU0952) encodes a 28-kDa immunodominant outer membrane protein termed AdaA that was previously found to be synthesized only by acute disease isolates in genomic groups I and II (82). A third polymorphic ORF (CBU0953) encodes an amino acid permease. ORFs encoding 14 amino acid and 3 peptide transporters are present in the C. burnetii genome, suggesting functional redundancy within this family of proteins (63).
In addition to ANKs and amino acid/peptide transporters, polymorphisms were identified in seven additional ORFs that are part of large gene families. First, two of at least five predicted nucleotidyltransferases, a family of proteins with roles in DNA repair, transcription, DNA replication and RNA processing, were polymorphic in Q321 and Le B, and one was polymorphic in genomic group IV isolates. Second, 2 of 21 predicted ATPases, proteins associated with a variety of cellular functions, were polymorphic in genomic group IV, while one was polymorphic in genomic groups II, III, Q321, and Le B. Lastly, two of five ORFs predicted to encode multidrug resistance B proteins were polymorphic in genomic group IV, Q321, and Le B.
The acute disease isolate He contains a unique polymorphism (polymorphism 41) that encompasses relB (CBU1991) and the beginning of relE (CBU1992). In Escherichia coli, these genes regulate the stringent response to amino acid starvation that results in inhibition of translation (11, 12). A second unique He polymorphism (polymorphism 11; CBU0598) is predicted to encode a protein that is part of the MutT/NUDIX (nucleoside diphosphate linked to some other moiety, X) family of proteins, a family consisting of widely distributed enzymes that function by removing potentially hazardous metabolites and/or modulating the buildup of biochemical pathway intermediates (6). Genomic group V isolates contained two polymorphic ORFs with homology to luxR. LuxR is a transcriptional activator that acts as a receptor for molecules involved in bacterial quorum sensing (26).
Possible mechanisms of polymorphism generation. Microarray validation indicated that most polymorphisms were generated via DNA insertion or DNA excision and that these events were likely facilitated by flanking repetitive sequences or transposon-like sequences. We divided polymorphisms into five categories based on proximity to repeat sequences or transposases and, if present, the character of inserted DNA. The first category was polymorphisms consisting of deletions that are immediately flanked by repeat sequences. These sequences were typically small (<9 bp), but in four cases large repeats ranging in size from 49 to 175 bp were observed (polymorphisms 25, 26, 35, and 41). Interestingly, the largest of these repeats was a family of three, having ca. 89% identity, with one repeat each located between CBU1015 and CBU1016, CBU1019 and CBU1020, and CBU1022 and CBU1023. In genomic groups II, III, and V, homologous recombination probably occurred between the first two repeats, resulting in the deletion of CBU1016 to CBU1019 (polymorphism 25), while recombination between the first and third repeats likely resulted in deletion of CBU1016 to CBU1022 (polymorphism 26) in Q321 and Le B. A second category was polymorphisms consisting of deletions that are less than five ORFs away from a transposase (polymorphisms 1, 16, 28, 29 to 31, and 32). A third category was polymorphisms consisting of deletions where the corresponding DNA in NMI contains a transposase (polymorphism 41), suggesting that the excision of this element in other isolates resulted in this polymorphism. A fourth category was polymorphisms that resulted from DNA insertion. In all but one case, the inserted DNA had no matches to sequences in the National Center for Biotechnology Information database (http://www.ncbi.nlm.nih.gov/BLAST/) and presumably represents novel C. burnetii sequences. The exception was an insertion of an IS1111A transposase-like sequence in polymorphism 2. The fifth category consisted of "clean" deletions whereby an entire region was missing and with no repeat sequences in the immediate vicinity (e.g., polymorphism 31).
LPS phase variation. The only known difference between virulent phase I and avirulent phase II C. burnetii is production of a severely truncated LPS by phase II organisms (2). To better understand the genetic basis of LPS phase variation and the attenuated virulence of C. burnetii phase variants, we conducted CGH of the genomes of NMC, NMII, and the Australian QD (Au) isolate. The LPS phase variation phenomenon is illustrated in Fig. 2A. NMII produced a severely truncated LPS (ca. 2.5 kDa) relative to the laddered molecule of NMI (ca. 19, 15, and 14.3 kDa). NMC produced an intermediate-length LPS (ca. 14.3 kDa). The Au isolate, subjected to 177 egg passages and serologically defined as phase II, produced an LPS with a molecular mass similar to that of NMII (ca. 2.5 kDa). The only NMII and NMC polymorphisms revealed by CGH were 20 ORFs (CBU0679 to CBU0698) and 24 ORFs (CBU0676 to CBU0699), respectively, that are included in the deletions previously described by Hoover et al. (35) (Fig. 2B). No other polymorphisms were detected in either isolate. Interestingly, the Au isolate contained no polymorphisms, indicating that a large deletion is not required for LPS phase variation.
FIG. 2. LPS profiles and ORF polymorphisms of C. burnetii isolates synthesizing different LPS chemotypes. (A) Silver-stained sodium dodecyl sulfate-15% polyacrylamide gel electrophoresis profiles of purified LPS molecules from C. burnetii Nine Mile I (NMI), Nine Mile II (NMII), Nine Mile Crazy (NMC), and Australia QD (Au) isolates. The relative sizes of molecular mass markers in kilodaltons are shown on the left. (B) Comparison of the genomes of NMII, NMC, and Au isolates with the genome of the Nine Mile (RSA493) reference isolate. Labeled DNA was hybridized to the RMLchip_a containing probe sets specific for all ORFs of the Nine Mile isolate. ORF polymorphisms were identified as ORF probe sets showing log2 hybridization ratios lower than -1. Red ORFs were completely deleted and blue ORFs were partially deleted. Partial ORFs that were previously identified by Hoover et al. (35), but not detected by the CGH, are depicted in green.
Phylogenetic analysis of C. burnetii isolates. The phylogenetic relationships between the isolates used in the present study (excluding NMII and NMC) were analyzed by using PAUP software. Phylogenetic trees were constructed using microarray data based on hybridization signals of all 2,103 chromosomal and plasmid ORFs (Fig. 3A), 2,063 chromosomal ORFs (Fig. 3B), or 40 plasmid ORFs (Fig. 3C). This analysis confirmed the RFLP-based genomic groupings (I to VI) as previously proposed by Hendrix et al. (33). Although the Id and He isolates both harbor QpH1, they were more related to each other than to genomic group I isolates that also carry this plasmid. Indeed, if only chromosomal polymorphisms are considered, the Du 7E9-12 isolate was most closely related to genomic group I; however, when considering just plasmid polymorphisms, this isolate was most similar to genomic group IV isolates. Genomic group V isolates, having the greatest number of polymorphic ORFs with 87, were most divergent from genomic group I. Q321 and Le B isolates were most closely related to genomic group IV isolates; however, unique polymorphisms indicate that they represent two novel genomic groups of C. burnetii referred to here as genomic groups VII and VIII, respectively.
FIG. 3. Parsimony analysis of chromosomal and plasmid ORF content of C. burnetii isolates. Phylogenetic trees were constructed by using both chromosomal and plasmid ORFs (with 123 of 132 variable characters being parsimony informative and equally weighted) (A), chromosomal ORFs (with 93 of 102 variable characters being parsimony informative and equally weighted) (B), or plasmid ORFs (with all 30 variable characters being parsimony informative and equally weighted) (C). One of the two most parsimonious trees for each of these groups is depicted. (See Fig. S2 in the supplemental material for the alternative tree.) Proposed genomic groups (I to VIII) are shown in panel A. The horizontal bar indicates the number of character changes.
Consistent with other obligate intracellular bacteria (3, 23, 55, 84), and as proposed by Seshadri et al. (63), our results suggest that the C. burnetii genome is undergoing reductive evolution. Accordingly, we found very few polymorphisms resulting from DNA insertion. The C. burnetii genome contains 83 pseudogenes, 79 of which are represented on the RMLchip_a. Twelve pseudogenes (15.2%) are completely or partially deleted in at least one C. burnetii isolate. Conversely, only 6.9% of coding ORFs are deleted; thus, nonfunctional ORFs are apparently being targeted for deletion at a higher frequency than coding ORFs. Moreover, among deleted functional ORFs, there appears to be a bias toward deletion of those with potential functional redundancy. For example, two of five predicted nucleotidyltransferases are missing in Q321 and Le B isolates. Other examples include the deletion of 4 of 14 ankyrin repeat proteins, 2 of 21 predicted ATPases, and 2 of 5 multidrug resistance B proteins.
Hendrix et al. (33) were equivocal on the genetic distance between genomic groups I, II, and III, as each had a distinct RFLP pattern but all harbored the same plasmid (QpH1). The ability of CGH to screen every ORF in the C. burnetii genome allowed us to confirm that isolates within genomic groups I, II, and III are indeed genetically distinct. Eight and four chromosomal polymorphisms distinguish genomic group II and III isolates, respectively, from genomic group I isolates, and six polymorphisms distinguish genomic group II isolates from genomic group III isolates. Phylogenetic trees constructed using CGH data suggest a divergent evolution whereby genomic group I is the ancestor of genomic group III and genomic group III is the ancestor of genomic group II (Fig. 3). Genomic group I isolates were acquired from geographically diverse locations, including Australia, the United States, Africa, Turkey, and Panama, with the dates of isolation ranging from 1935 to 1967. Interestingly, our results indicate that the gene content of these isolates is exactly the same relative to NMI, suggesting a worldwide spread of a common ancestor that has since undergone limited evolution. C. burnetii genome content can also vary among isolates obtained at a similar time and location. Specifically, Du 7E9-12 (rodent isolate) and Du 5G61-63 (tick isolate) were both obtained in Dugway, Utah, in 1958, but only Du 7E9-12 has polymorphisms relative to NMI, which are largely the consequence of its carriage of the plasmid QpDG. We also determined the evolutionary relationships of four isolates previously ungrouped by RFLP analysis. BDT and Du 5G61-63 have no ORF polymorphisms relative to genomic group I isolates and hence fall within this genomic group. Q321 and Le B are highly similar to the genomic group IV isolates, suggesting a recent common ancestor, yet they have diverged enough to form two new genomic groups (VII and VIII, respectively). Interestingly, the only difference between Le B and Q321 is polymorphism 2, which is present in Le B and genomic group V isolates. Sequencing indicates this polymorphism resulted from insertion of an IS1111 transposon-like sequence at the same location. Genomic groups IV, V, VII, and VIII comprise a distinct clade from genomic groups I, II, III, and VI. Within this clade, genomic group V is the most divergent from NMI, being most closely related to genomic groups IV, VII, and VIII.
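Pairwise comparisons of this kind can be expressed as set operations on each group's polymorphisms relative to the reference. The Python sketch below is illustrative only: it assumes polymorphisms are counted as discrete events relative to NMI, and the identifiers (p1, p2, and so on) are hypothetical placeholders. The overlap of three shared polymorphisms was chosen simply so that the toy profiles give differences of eight, four, and six; which polymorphisms are actually shared is not stated in this section.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical polymorphism identifiers relative to the reference (NMI).
# group_I is empty because its gene content matches the reference.
profiles: Dict[str, Set[str]] = {
    "group_I": set(),
    "group_II": {"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"},
    "group_III": {"p1", "p2", "p3", "p9"},
}


def pairwise_differences(
    profiles: Dict[str, Set[str]],
) -> Dict[Tuple[str, str], int]:
    """Count polymorphisms found in exactly one member of each pair."""
    return {
        (a, b): len(profiles[a] ^ profiles[b])  # symmetric difference
        for a, b in combinations(sorted(profiles), 2)
    }


distances = pairwise_differences(profiles)
for pair, count in distances.items():
    print(pair, count)
```

Using the symmetric difference means a polymorphism shared by two groups does not count toward the distance between them, which is why two groups with eight and four polymorphisms relative to the reference can differ from each other by fewer than twelve.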
Although predisposing host factors are clearly important in the development of human acute or chronic Q fever (32, 51), C. burnetii isolates also exhibit distinct phenotypes in terms of infectivity of cell culture (53), cytopathic effects on host cells (53, 66), and pathogenicity in animal models of Q fever (43, 64, 68). The general consensus of these studies is that the NMI reference isolate (genomic group I), biologically representative of human acute disease isolates, is more infectious, grows faster in cell culture, and is more virulent in animal models of infection than chronic disease isolates of genomic groups IV and V. Evidence of C. burnetii pathotype-specific ORFs was recently provided by Glazunova et al. (24), who found by multispacer sequence typing analysis that both plasmid type and genotype correlate with disease outcome. The CGH identification in the present study of polymorphic ORFs associated with chronic disease isolates and isolates of attenuated virulence allows speculation concerning their roles in Coxiella pathogenicity and disease outcome.
The degree of variance between NMI and chronic disease isolates of genomic groups IV and V ranged from 2.3% (51 polymorphic ORFs) to 4.1% (87 polymorphic ORFs), respectively, with most polymorphic ORFs encoding hypothetical proteins. Only seven chromosomal and nine plasmid ORFs are polymorphic in both genomic groups, suggesting the loss of these ORFs may result in the unique phenotypes of these isolates. The NMI genome contains 14 ANK-encoding ORFs (63) and, although these proteins are rare in prokaryotes, ANKs are also found in the intracellular bacteria Legionella pneumophila (9), Anaplasma spp. (7, 49), Ehrlichia ruminantium (15), Wolbachia pipientis (20, 80), and Rickettsia spp. (4, 47), where they are suspected of modulating host functions. Four ANK-encoding ORFs are missing in genomic group IV and V isolates, with one (CBU1609) having homology to AnkA of A. phagocytophilum (49). In A. phagocytophilum-infected human promyelocytic leukemia HL-60 cells, AnkA is secreted, binds host DNA, and possibly regulates transcription (8, 49). The absence of CBU1609 and other ANK-encoding genes in C. burnetii chronic disease isolates indicates they are not required for potential modulation of host gene function during the disease process, with the caveat that ANK proteins may be functionally redundant. Two ORFs (CBU1804 and CBU1805) encoding proteins with similarity to the quorum-sensing regulator LuxR were deleted in genomic group V isolates; however, the lack of additional ORFs related to quorum sensing within the C. burnetii genome suggests these ORFs serve other regulatory functions. Although one or a few polymorphic ORFs may explain the association of C. burnetii genomic groups with chronic disease, another possibility is that deleted ORFs in toto simply result in the slower growth of these isolates and, consequently, the stimulation of an attenuated immune response. Indeed, slow growth resulting from a dramatically reduced genome is thought to contribute to Mycobacterium leprae's known ability to cause chronic infection (13, 81).
Du 7E9-12 of genomic group VI represents a naturally occurring isolate that is weakly pathogenic in a guinea pig model of infection (68). It contains six chromosomal and eight plasmid polymorphisms relative to NMI. However, only one unique polymorphism is predicted to result in loss of a functional ORF: a partial deletion of CBU0478, which encodes a hypothetical protein. This raises the possibility that CBU0478 is necessary for virulence. There is precedent for small genetic differences conferring dramatically different virulence properties in obligate intracellular bacterial pathogens. For example, the gene contents of Chlamydia trachomatis and C. muridarum (formerly C. trachomatis, biovar mouse pneumonitis), human- and mouse-specific pathogens, respectively, are >99% identical (52). However, these pathogens have a strict tropism for their respective animal hosts, a phenomenon apparently conferred by just a few genes in a small chlamydial plasticity zone (45).
Collectively, 10 ORFs are polymorphic in acute disease isolates of genomic groups II and III. Three ORFs (CBU1991, CBU1992, and CBU0598), predicted to encode RelB, RelE, and a MutT/NUDIX family protein, respectively, are polymorphic in the He isolate of genomic group II. The loss of RelB and RelE, which regulate the stringent response (11, 12), might affect the stationary-phase physiology of the He isolate and, consequently, development of the stable small cell developmental form (14). The absence of CBU0598 might limit the He isolate's ability to detoxify hazardous materials or prevent unbalanced buildup of normal metabolites (6), thereby detrimentally affecting its growth. Both hypotheses should be testable using in vitro infection models.
We recognize caveats in our attempt to correlate C. burnetii genetic polymorphisms with specific biological and virulence properties of isolates. First and foremost, our analysis is unidirectional in that all isolates were compared to the sequenced NMI isolate. As such, we cannot ascertain the level of novel coding capacity carried by tested isolates, a distinct possibility considering the range of isolate genome sizes roughly determined by pulsed-field gel electrophoresis (78). Second, the tiled oligonucleotide probe sets on the RMLchip_a cover ca. 30% of the NMI coding capacity. Therefore, mutations or deletions outside of this region will not be detected. It is also possible that some insertions and point mutations within tiled regions were not detected. We were initially concerned that genomic nucleotide sequence variation between NMI and test isolates might result in negative hybridization signals for intact and functional genes. However, nearly all polymorphisms were confirmed by PCR validation as true deletions, with only the occasional polymorphism resulting from synonymous sequence variation.
Although the impact of most polymorphic ORFs on C. burnetii virulence is speculative, deletion of ORFs involved in LPS biosynthesis is clearly associated with attenuated virulence (51). Our CGH results for NMC and NMII producing truncated LPS are consistent with previous studies showing each variant with a single large chromosomal deletion (35, 76). The NMC deletion includes the 21 ORFs missing in NMII and extends by two ORFs on each side of the NMII deletion (35). Most deleted ORFs are predicted to function in O-antigen biosynthesis, and their absence explains the intermediate truncation of NMC LPS. However, their absence does not explain why NMII, with its smaller deletion, produces a more truncated LPS than NMC. NMII likely has an additional point/frameshift mutation, small deletion, or transposon insertion in a gene early in the LPS biosynthetic pathway that is not detected by CGH. An analogous situation may exist with the Au isolate, which produces a severely truncated LPS similar to NMII but is identical to NMI by CGH. Thus, our data confirm not only that large chromosomal deletions are not required for C. burnetii LPS phase variation, as previously proposed by Thompson et al. (72), but also that the avirulence of NMII C. burnetii appears to be entirely attributable to deficient LPS production since other ORF deletions were not detected.
Plasmid polymorphisms specific to QpH1, QpRS, QpDV, and integrated plasmid-like sequences were confirmed via comparison with their deciphered sequences (38, 63, 79), while polymorphisms of QpDG are described here for the first time. Nine predicted functional plasmid ORFs are conserved among all plasmid types or integrated-plasmid sequences. Interestingly, two of these ORFs, CBUA0008 and CBUA0010, encode proteins that exhibit similarity to phage proteins involved in tail assembly and site-specific recombination, respectively. These ORFs may be remnants of a phage that had integrated into the C. burnetii plasmid. Flanked by these phage genes is a degenerate ORF (CBUA0009) annotated as a putative toxin. This protein has the highest similarity to insecticidal and nematocidal toxins synthesized by the waterborne bacterium Chromobacterium violaceum (16) and the nematode symbionts Xenorhabdus bovienii and Photorhabdus luminescens (10, 19), suggesting it was acquired by C. burnetii via horizontal transfer mediated by phage integration. This ORF may have been beneficial to a C. burnetii progenitor with an insect host range. Interestingly, the closest relative of C. burnetii is Rickettsiella grylli, a cricket pathogen (54). Excluding phage-related ORFs, seven plasmid ORFs are conserved among all C. burnetii isolates examined in the present study, suggesting a critical role in some aspect of C. burnetii intracellular parasitism. Only CBUA0011 has an annotated function, with similarity to ripX of Bacillus subtilis, a gene involved in chromosomal partitioning (61).
Movement of mobile genetic elements clearly contributes to C. burnetii genomic plasticity. C. burnetii contains a large number of insertion sequences (IS elements) (63) relative to most obligate intracellular bacteria (13, 47, 67, 70). Of the 51 single-event polymorphisms, 9 are within five ORFs of a transposase or contain a transposase, indicating that these IS-type elements have a role in generating C. burnetii genetic diversity. Indeed, Southern blotting shows a range in the number of IS elements between C. burnetii isolate genomic groups (manuscript in preparation). By this analysis, isolates within a genomic group have identical hybridization patterns, supporting the phylogenetic relationships revealed by CGH. Movement and/or loss of these mobile elements could be a mechanism by which C. burnetii rearranges and/or reduces its genome to better suit its intracellular lifestyle.
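The proximity criterion used above (a polymorphism that contains a transposase or lies within five ORFs of one) can be sketched as a simple positional test. In the Python example below, ORFs are represented by their order along the chromosome; the positions, polymorphism labels, and the reading of "within five ORFs" as a positional difference of five or less are all assumptions made for illustration, not values taken from this study.

```python
from typing import Dict, Iterable, List


def near_transposase(
    polymorphic_orfs: Iterable[int],
    transposase_orfs: Iterable[int],
    window: int = 5,
) -> bool:
    """True if any polymorphic ORF is a transposase or lies within
    `window` ORF positions of one (positions = gene order indices)."""
    transposases = list(transposase_orfs)
    return any(
        abs(orf - tn) <= window
        for orf in polymorphic_orfs
        for tn in transposases
    )


# Hypothetical gene-order positions (illustrative only).
transposase_positions: List[int] = [10, 40]
polymorphisms: Dict[str, List[int]] = {
    "poly_a": [12, 13],      # two ORFs downstream of a transposase
    "poly_b": [20, 21, 22],  # no transposase within five ORFs
    "poly_c": [40],          # the polymorphic ORF is itself a transposase
    "poly_d": [46, 47],      # nearest transposase is six ORFs away
}

flagged = [
    name
    for name, orfs in polymorphisms.items()
    if near_transposase(orfs, transposase_positions)
]
print(flagged)
```

A circular chromosome would need wrap-around distances at the origin; that detail is omitted here to keep the sketch short.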
In summary, the present study provides the first comprehensive whole-genome genotyping of C. burnetii. We show that, in general, the genomes of C. burnetii isolates obtained from a wide range of biologically and geographically diverse sources are highly conserved. However, notable ORF polymorphisms occur that may contribute to the virulence potential and other biological properties of C. burnetii isolates. CGH assessment of C. burnetii genome content can help to identify cross-protective subunit vaccine candidates for protection against Q fever, facilitate the development of new diagnostic tools, and, as an epidemiological method, classify clinical, forensic, and environmental isolates. Indeed, in conjunction with whole-genome amplification, rapid whole-genome typing of isolates in clinical samples can be achieved without propagation of the organism. Finally, a CGH approach to identify potential pathotype-specific virulence genes is especially valuable considering the lack of genetic systems for C. burnetii.
This research was supported by the Intramural Research Program of the National Institutes of Health, National Institute of Allergy and Infectious Diseases.
Supplemental material for this article may be found at http://jb.asm.org/.
immune evasion is linked to host infection tropism. Proc. Natl. Acad. Sci. USA 102:10658-10663.