Journal of Bacteriology, January 2006, p. 249-254, Vol. 188, No. 1
0021-9193/06/$08.00+0 doi:10.1128/JB.188.1.249-254.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Institute of Medical Microbiology and Hospital Epidemiology, Hannover Medical School, Carl-Neuberg-Str. 1, 30625 Hannover, Germany,1 Institute of Hygiene and Microbiology, University of Würzburg, Josef-Schneider-Str. 2, D-97080 Würzburg, Germany,2 Berna Biotech, Ltd., Berne, Switzerland,3 Louisiana State University Medical Center, New Orleans, Louisiana,4 Division of Comparative Medicine, MIT, Cambridge, Massachusetts 02139,5 Mathematical Genetics Group, Department of Statistics, University of Oxford, Peter Medawar Building for Pathogen Research, South Parks Road, Oxford OX1 3SY, United Kingdom6
Received 18 July 2005/ Accepted 6 October 2005
Much of the current knowledge about genetic diversity of H. pylori has been gained from population genetic analysis of sequence data obtained from H. pylori strains isolated from unrelated persons of diverse geographical origins (1, 17). Since H. pylori infection is chronic and can persist for decades, an alternative approach is the analysis of genetic relationships between H. pylori strains that have been isolated sequentially from the same patient. We have used mathematical modeling on sequence data from 24 pairs of sequential H. pylori isolates from Colombia and the United States to estimate basic parameters of recombination and mutation of H. pylori in vivo. The data showed that the estimated mean size of imported fragments was only 417 bp, the shortest import length that has been reported for a bacterium. Half of the genome is replaced by homologous sequence in a time frame estimated to be from 40 to 2,000 years (6). The wide credibility region of this estimate is principally due to a high degree of uncertainty about the time since the last common ancestor of each pair of strains, which were taken from middle-aged patients at intervals from 3 months to 4 years, but even the higher values in this range represent extremely rapid import of DNA compared with other bacteria (6).
H. pylori strains not only show allelic diversity but also differ in their gene contents. The two completely sequenced genomes of H. pylori strains 26695 and J99 share only 94% of their genes (2), whereas approximately 7% of the genes are unique for each strain, respectively (22). A comparison of 15 strains with DNA microarrays showed that in this set of strains, 1,281 of 1,643 genes present on the array (combined from 26695 and J99 genomes) were shared by all strains, whereas 22% of genes were only present in some of the strains (15). We have recently further refined the estimate of the number of genes present in all H. pylori strains to 1,111 by comparative genome hybridizations performed with a globally representative collection of 54 H. pylori strains (9). Furthermore, Israel et al. found numerous genomic differences between H. pylori J99, a strain whose genome sequence was published in 1999, and multiple isolates cultured from the same patient at the same time as J99 and 6 years later (10). Similar results were found in another study, where two clones of a patient were taken at the same time and compared to each other (3).
These results suggested that the H. pylori genome has a highly plastic gene content. To quantitate the rate of genomic changes (gene loss or acquisition) during chronic colonization and to compare it to rates of allelic replacement events, we have used whole-genome DNA microarray hybridization to study gene loss and gene acquisition in the same collection of sequential H. pylori isolates previously studied to determine the rates of mutation and recombination in vivo (6). Where the microarray data indicated a possible genomic change, the event was verified by PCR and sequence analysis. Changes of gene content occurred in 5 of the 21 pairs analyzed, indicating significant plasticity of the genome content. However, quantitative analysis shows that genetic events leading to loss or acquisition of genes are far less frequent than events leading to allelic exchanges alone.
Whole-genome DNA microarray. The composition of the H. pylori whole-genome microarray used in this study has been previously described (14). It is based on PCR products derived from the two available H. pylori whole-genome sequences of strains 26695 (22) and J99 (2), and contains 1,655 probes representing 96.1% of genes present in H. pylori 26695 or J99. The performance of the array for H. pylori genome comparisons was validated by experiments comparing the two strains, 26695 and J99, whose genomes have been sequenced. These experiments showed that we detected 100% of the known strain-specific genes of J99 and 97.3% of the known strain-specific genes of 26695 as defined by Salama et al. (15). Four strain-specific genes of 26695 (HP0315, HP1001, HP1516, and HP1537) were incorrectly scored as present in J99 when a conservative cutoff of 2 for the ratio was used but scored as absent using a cutoff of 1.5. For these genes, overall low signal intensities were obtained under our hybridization conditions, very likely due to short sequence lengths and/or low G-C content of these genes, making scoring errors more likely.
Hybridization. Fluorescent labeling of DNA and competitive hybridizations were performed as described by Salama et al. (15), with several modifications. Whole-genomic DNA was prepared from blood agar plate-grown H. pylori with QIAGEN Tip-100 columns. Labeling of the DNA with aminoallyl-dUTP was performed with the BioPrime DNA labeling system (Invitrogen) minus the deoxynucleoside triphosphates (dNTP) set. A dNTP set was made using 0.5 mM dGTP, dATP, dCTP, 0.3 mM dTTP (Amersham Pharmacia), and 0.2 mM aminoallyl dUTP (Sigma). One microgram of DNA was diluted to 24 µl in water, 20 µl of random octamer-primer solution was added, and the mixture was heated for 5 min at 99°C. The DNA was then placed directly on ice, 5 µl of dNTPs and 1 µl of high-fidelity Klenow enzyme were added to the reaction mixture, and the reaction was incubated at 37°C for 1 h. The reaction was stopped with 5 µl 0.5 M Na2EDTA, pH 8.0. Purification was done by a modified QIAGEN PCR purification protocol. PE and AE buffers (QIAGEN) were replaced with 5 mM KPO4 (pH 8.0)-80% EtOH and 4 mM KPO4 (pH 8.5), respectively. The Cy3 and Cy5 dyes (Amersham Pharmacia) were diluted using 72 µl of water, and 4.5 µl was aliquoted and dried before storage in the dark at 4°C. Labeling of the probe with the Cy dyes followed the protocol by Salama et al. (15) with the exception that before hybridization the Cy3 and Cy5 probes were combined and 100 µg of yeast tRNA was added. After being dried, the probe was resuspended in 32 µl of hybridization buffer (50% formamide, 6x SSC [1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate], 0.5% sodium dodecyl sulfate, 50 mM Na phosphate [pH 8.0], and 5x Denhardt's solution). The probe was heated for 2 min at 99°C and applied to the slide. Microarray scanning and data processing were performed as previously described (11, 18).
Empty-site PCR and sequencing. All loci where the microarray analysis indicated that loss or gain of genes had occurred were further characterized by PCR and sequence analysis. The loci were amplified by PCR with primers targeting genes that flank the deletion-acquisition event. PCR products were purified and sequenced as previously described (19).
Southern hybridization. Loss of insertion sequence (IS) element copies was verified by Southern blot hybridization under high-stringency conditions (16). Genomic DNAs of the LSU1062 strains and the pair NQ315-NQ1712 were digested with EcoRI and electrophoretically separated on a 1% agarose gel. A 600-bp fragment of HP0414, which is part of the IS606 element, labeled with digoxigenin-dUTP was used as the probe.
Accession numbers. All new sequences generated in this study have been submitted to the GenBank/EMBL/DDBJ databases (accession numbers AM086402 to AM086418).
A signal ratio of 1 indicated the presence of the gene in both strains. Signal ratios of <0.5 or >2 were interpreted as evidence of potential loss or gain of the respective gene. When both channels yielded signal intensities of <350, the gene was excluded from the analysis, the most likely reason being the absence of the gene in both strains. This cutoff for missing genes was determined empirically, with data from comparative genome hybridizations of either strain LSU2003-1 or strain LSU1010-1, with 26695 and J99 on the same microarray (9) being used for calibration.
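The scoring rules described above can be sketched as a small classifier. This is a minimal illustration, not the authors' data-processing pipeline: the cutoffs (0.5, 2, and 350) come from the text, while the function name, the label strings, and the choice of which channel serves as the numerator of the ratio are assumptions made for the example.

```python
# Cutoffs quoted in the text; all names below are hypothetical.
LOW_RATIO = 0.5        # ratios below this suggest the gene is missing from strain A
HIGH_RATIO = 2.0       # ratios above this suggest the gene is missing from strain B
MIN_INTENSITY = 350.0  # spots below this in both channels are excluded


def classify_spot(intensity_a: float, intensity_b: float) -> str:
    """Score one array spot from the signal intensities of the two channels.

    Assumption: the ratio is computed as channel A over channel B. The text
    does not state which strain of a pair is used as the numerator.
    """
    # Weak signal in both channels: the gene is most likely absent from both strains.
    if intensity_a < MIN_INTENSITY and intensity_b < MIN_INTENSITY:
        return "excluded"

    # Guard against division by zero when one channel has no signal at all.
    ratio = intensity_a / intensity_b if intensity_b > 0 else float("inf")

    if ratio > HIGH_RATIO:
        return "candidate_only_in_a"
    if ratio < LOW_RATIO:
        return "candidate_only_in_b"
    return "present_in_both"


if __name__ == "__main__":
    for a, b in [(1000, 1000), (3000, 1000), (400, 1000), (100, 200)]:
        print(a, b, classify_spot(a, b))
```

Spots labeled as candidates in this sketch correspond to the loci that would then be followed up by PCR and sequencing; the labels by themselves do not establish loss or gain of a gene.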
26695 and J99 have 1,590 and 1,495 predicted genes, respectively, of which 96% were represented on the array. An average of 1,453 spots were detected per tested genome pair. Assuming that they have a genome size similar to those of J99 and 26695, this indicates that the great majority of genes contained in the NQ and LSU strains were assayed by our array system.
In 3 of 7 NQ strain pairs and 9 of 14 LSU strain pairs, microarray hybridizations did not yield evidence of a single difference in genome content. These pairs were not investigated further. In the remaining nine pairs, microarray hybridizations indicated potential gene content differences at one or more loci.
To validate the results of microarray hybridizations and to further analyze the events underlying the genomic changes, "empty-site PCRs" were designed for all changes indicated by the microarray hybridizations, and the PCR products were sequenced. Primers were designed based on the sequences of genes flanking the predicted genomic events. The validation experiments confirmed the predicted changes in six cases (Table 1; Fig. 1 and 2A). An additional two events indicated by the microarray experiments could be explained by homologous recombination events, which changed the degree of sequence homology to the DNA spotted on the microarray but involved no loss or gain of sequence (Fig. 2B). PCR and sequence analysis did not confirm five of the changes that were predicted from the microarray results, because the genes were found either to be absent from both strains or not to show any sequence differences. Only those events validated by sequencing were included in the subsequent calculations.
TABLE 1. Comparison of the numbers of genetic changes identified by multilocus sequence analysis (6) with the numbers of genomic changes identified by microarray hybridization and sequence analysis
FIG. 1. Genomic differences between paired sequential isolates of H. pylori. The line marked "consensus" represents genes present in both strains. Genes and/or sequences only present in one strain are indicated above and below the consensus lines. Mosaic sequences are indicated in red, the coding sequence is shadowed in dark gray, and the noncoding sequence is shown in light gray. (A) Loss of the complete cag PAI in strains NQ315/1712. The cag PAI was deleted at a 31-bp repeat. The sequences flanking the repeats were identical in both strains. (B) Partial loss of the cag pathogenicity island in the LSU1062 pair. The earlier strain contained an IS606 element at the 3' end of the cag PAI. The later strain lost half of the 5' end of gene HP0527, genes HP0528 to -0548, and most of the IS606 element. See the text for details. (C) Uptake or loss of the restriction-modification system HpyAIV in strains LSU1016. The RM system occurs in place of a putative open reading frame of unknown origin (green; the orf gene is indicated by an arrow). The genome alteration was mediated by a recombination event. (D) Incorporation and/or loss of a putative iron binding protein (ceuE) gene in NQ315/1712. The change in copy number of the ceuE gene was mediated by recombination. (E) Partial loss of pseudogenes (marked by asterisks) HP0903/0904 in strains LSU1040. The earlier strain contained a partially deleted gene (HP0903) and a complete gene (HP0904), whereas the deletion of gene HP0903 was longer in the later strain and included part of gene HP0904. The sequence showed a recombination event of at least 600 bp in gene HP0904. The sequence of the later strain was 1,000 bp shorter than for the earlier strain.
FIG. 2. Sequence mosaics created by recombination events during chronic colonization with H. pylori, leading to gain or loss of sequence (A) or to allelic replacement only (B). Numbers on the left indicate the strain pair and the genes affected by the event. The gene numbers refer to H. pylori strain 26695. Sequenced regions are indicated by rectangles. Empty rectangles represent sequence identity between the two sequential strains. Vertical lines indicate the positions of polymorphic nucleotides. Below the rectangles, the lengths of sequence mosaics and of the flanking regions with identical sequence are indicated. (A) Characteristics of mosaic sequences adjacent to recombination-mediated genomic changes. Two events showed a flanking mosaic only on one side of the gene deletion-acquisition event. The column marked "net sequence difference" indicates the difference between the lengths of sequences between the regions depicted. See the legend to Fig. 1 for details of these gene deletion-acquisition events. (B) Characteristics of allelic replacement events without gain or loss of sequence.
A partial loss of the cag island was observed in a second pair, LSU1062 (Fig. 1B). The earlier strain, LSU1062-1, contained a complete cag PAI that was followed by a complete IS606 element, consisting of tnpA and tnpB homologs. The deletion in LSU1062-3 started in the "middle repeat region" of gene HP0527 (cagY) and comprised all remaining cag island genes (HP0528 to -0548), as well as part of the IS606 element (tnpA and the first 192 nucleotides of tnpB).
Changes involving IS elements and restriction-modification systems. Within the LSU1062 pair, the later strain showed evidence of having lost its only copy of the IS606 element. The 26695-J99 whole-genome microarray contains several spots that hybridize with components of IS elements (e.g., the tnpA and tnpB homologs HP0413/0414, HP1007/1008, and JHP826/827). In the LSU1062 pair, none of these spots showed hybridization with the later strain, in contrast to the earlier strain. To ascertain that no other copies of IS elements were still present in the genome of the later strain, a Southern blot hybridization was performed, which confirmed that the later strain did not contain any copy of these IS elements. Southern blot hybridization also showed that an IS606 element was lost in the strain pair NQ315/1712. The location of this event could not be determined.
The later strain of the LSU1016 pair contained the restriction-modification system HpyAIV (HP1351/1352), of which only small fragments were present in the earlier strain, interrupted by a putative open reading frame of unknown function. In the later strain, this region contained the complete restriction-modification system (Fig. 1C).
Changes involving the plasticity region. In addition to the cag PAI, a region termed the "plasticity zone" has been described as harboring the majority of strain-specific genes (2). Microarray hybridization indicated that 14 plasticity region genes (HP0990 to -0995 and JHP929 to -936) were present in the later strain of the LSU1014 pair but absent from the earlier strain. The event underlying this change could not be further characterized by empty-site PCR, because the region flanking the acquired genes was too different from strains 26695 and J99. The microarray results were therefore verified by single-gene PCRs for three selected genes (HP0995, JHP933, and JHP934), which were positive only in the later strain and therefore confirmed the microarray results in all cases.
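As a quick consistency check on the gene count given above, the abbreviated locus ranges can be expanded into individual locus tags. The helper below is a hypothetical convenience function written for this illustration; it assumes that the range notation always has the form "PREFIXnnnn to -nnnn" as printed in the text and that the zero padding of the first tag applies to the whole range.

```python
import re
from typing import List


def expand_locus_range(spec: str) -> List[str]:
    """Expand a range such as 'HP0990 to -0995' into individual locus tags."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+) to -(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"unrecognized locus range: {spec!r}")
    prefix, start, end = match.groups()
    first, last = int(start), int(end)
    if last < first:
        raise ValueError(f"range end precedes range start: {spec!r}")
    width = len(start)  # reuse the zero padding of the first tag
    return [f"{prefix}{number:0{width}d}" for number in range(first, last + 1)]


# The two ranges reported for the later strain of the LSU1014 pair.
plasticity_genes = (
    expand_locus_range("HP0990 to -0995") + expand_locus_range("JHP929 to -936")
)

if __name__ == "__main__":
    print(len(plasticity_genes), plasticity_genes)
```

Under these assumptions the two ranges expand to 6 and 8 locus tags, which matches the 14 genes reported, and the three genes chosen for single-gene PCR all fall inside the expanded list.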
Changes involving housekeeping genes. Microarray hybridization indicated a genomic difference affecting the ceuE genes (HP1561 and HP1562) in pair NQ315/1712. PCR analyses showed that, in fact, the earlier strain NQ315 contained a single copy of ceuE, while NQ1712 contained two paralogous copies arranged in tandem, which is similar to the situation in the sequenced strains 26695 and J99, where the two gene copies have >90% nucleotide identity (Fig. 1D). Mosaic sequences were identified upstream and downstream of the ceuE genes, suggesting that the fragment containing one ceuE copy present in the earlier strain had been replaced by a fragment containing two copies in the later strain or vice versa.
The microarray data indicated absence of a cluster of pseudogenes (HP0903/0904) that contains truncated homologs of an acetate kinase gene (ackA; HP0903) and a phosphotransacetylase gene (pta; HP0904) in the later strain of pair LSU1040. Empty-site sequence analysis showed that HP0903 was truncated in both paired strains (Fig. 1E). In the later strain, a recombination event had caused an even larger truncation of HP0903 and also the partial deletion of gene HP0904. The total size of the deletion was approximately 1 kb. A mosaic sequence was only present downstream of the deletion.
4,000 for all 21 strain pairs. Thus, genetic changes associated with gain or loss of sequence were 650 times rarer than simple homologous replacement events. In three of six cases, the later strain had fewer genes than the earlier strain; in the other three cases, it had more. However, we cannot be certain whether each of these events was a deletion or an acquisition, because the event may have occurred before the isolation of the first strain, with both variants cocolonizing the stomach when the first sample was taken. Indeed, samples were taken at intermediate time points for a subgroup of pairs (Table 2), which in one case (LSU1014) showed both strains coexisting during the observation period.
TABLE 2. Genotypes of intermediate sequential isolates compared to those of first and last strain
The observed deletions could have been produced by intrachromosomal events or by import of empty-site alleles from unrelated strains. In the latter case, the import should have resulted in mosaic sequences flanking the deletion. Surprisingly, in most cases, mosaic sequences were only identified on one side of the deletion (Fig. 1 and 2). The mechanism responsible for this phenomenon is not known. No mosaic sequences were found in the pair that deleted the cag PAI, indicating that recombination between the imperfect copies of the 31-bp repeat was sufficient to delete the island, presumably by a RecA-dependent mechanism, similar to an event described by Björkholm et al. (3).
It is widely assumed that H. pylori uses genetic variation to adapt to individual hosts and specific niches within a host. While some variation can be generated in the absence of mixed colonization (e.g., by point mutations, slipped-strand mispairing, or intrachromosomal recombination), import of DNA fragments from an exogenous H. pylori strain appears to be the principal source of allelic variation, as well as genomic changes in H. pylori. Most of these events involve no loss or gain of sequence and are likely to have phenotypic and fitness effects that are subtle and difficult to detect.
The effectiveness of recombination as an adaptive strategy depends critically on the prevalence of H. pylori in a population. H. pylori is likely to have evolved in a situation in which infection was quasi-universal, permitting the panmictic exchange of genes and alleles. With declining prevalence of H. pylori, coinfections are likely to become much rarer, making genetic exchange ineffective as a means of genetic adaptation, which in turn may further accelerate the disappearance of H. pylori from certain populations.
This work was supported by grants PTJ-BIO 031U213B from the Bundesministerium für Bildung und Forschung competence center PathoGenoMik and SFB479/A5 from the Deutsche Forschungsgemeinschaft to S.S. and by DFG grant Jo 344/2-1 to C.J.