Human Biology Division, Fred Hutchinson Cancer Research Center, Seattle, Washington,1 Department of Microbiology, University of Washington School of Medicine, Seattle, Washington,2 Unidad de Investigación en Enfermedades Infecciosas, Hospital de Pediatria, IMSS, Mexico City, Mexico,3 Wolfson Digestive Diseases Centre and Institute of Infection, Immunity and Inflammation, University of Nottingham, Nottingham, United Kingdom,4 Department of Medicine/Gastroenterology, Michael E. DeBakey Veterans Affairs Medical Center, Baylor College of Medicine, Houston, Texas5
Received 2 November 2006/ Accepted 27 February 2007
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
There is diversity within the bacterial population in single infected hosts. Molecular fingerprinting studies using RAPD (57), amplified fragment length polymorphisms (AFLP) (40), and sequence analysis of virulence-associated or housekeeping genes (20, 46) with multiple single-colony isolates or cultures obtained from separate biopsy samples from single patients revealed that strains isolated from single patients were closely related but that there were subtle differences in the majority of patients. The variants have even been referred to as quasi-species of a single strain (35). A smaller number of patients (10 to 20%) have shown evidence of a mixed infection, defined as clones from a single patient that are more similar to isolates from other patients than they are to each other. Interestingly, strains isolated at the same time or after 6 months to 10 years have shown similar divergence (11, 35, 40, 41, 46, 57). Thus, while populations may diverge within the host, divergence appears to occur slowly.
Multilocus sequence typing (MLST) has been particularly useful for examining the mechanisms contributing to variation within the H. pylori population. Analysis of homoplasy using individual gene sequences from a large collection of isolates revealed that recombination between H. pylori strains is more frequent than recombination between strains of most species, so that gene loci on the single bacterial chromosome are in linkage equilibrium (48). Only isolates obtained from related family members showed signs of being clonal. A subsequent study in which the workers compared gene sequences at 10 loci from participants in antibiotic eradication trials and in which paired isolates were obtained at a mean interval of 1.8 years indicated that recombination, not mutation, explained most of the variation observed. Furthermore, strains obtained from the same individual differed in 3% of the genome (20). One disadvantage of MLST is that it queries only a limited portion of the genome (a fraction of a percent). Interestingly, comparative genomic hybridization using whole-genome microarrays (array CGH) has revealed that the presence of 25% of the genes in the genome of H. pylori strains varies (25, 44), and even within an infected individual the presence of 3% of the genes in isolates can vary (28). Recombination has been proposed to be the mechanism for gene acquisition and loss, although this has been convincingly demonstrated in only a few cases (30, 34, 46). When the recombination events occur remains an open question.
In several studies workers have investigated transmission of H. pylori strains within families (27, 42, 54). These studies suggest that children acquire strains most frequently from their mothers, but they also acquire strains from other family members. Additionally, multiple strain variants and recombinants have been observed in children. This raises the possibility that there is no stringent bottleneck during transmission, meaning that children may be infected with a population of bacteria, possibly from multiple sources. Alternatively, there may be a transmission bottleneck, but strains then diverge by mutation and recombination. Infection of experimental animals has shown that colonization by multiple distinct strains is possible and that strains do not undergo extensive recombination during relatively short times (1, 15, 49, 52). Additionally, serial passage in vitro or long-term infection of mice has not revealed measurable divergence of clones, arguing against the hypothesis that there is a high degree of genomic instability in H. pylori (37).
The aim of the present work was to examine the genetic diversity in the H. pylori populations colonizing the stomachs of single human hosts from a population with a high rate of infection, in both children and adults. We also searched for evidence of de novo genetic diversity generated during experimental transmission of a defined strain to new adult human hosts. Multiple single H. pylori colonies were isolated from different regions of the stomachs of patients and were analyzed by using sequence polymorphisms in the virulence gene vacA, RAPD-PCR, AFLP, and array CGH to quantify differences among isolates at the whole-genome level. For naturally infected adults and children, we observed ubiquitous colonization by an H. pylori population with multiple strain variants defined by genetic diversity in a limited number of genes. Mixed infection by multiple strains, defined by diversity in a much larger number of genes, was less frequent. In the case of mixed infection, we confirmed predicted phenotypic differences among the infecting strains and documented limited recombination between two strains. Consistent with animal infection studies, we obtained no evidence of genetic diversification shortly after experimental infection of adult human volunteers with a homogeneous strain. The presence of multiple related but distinct genotypes, even in children, suggests that multiple genotypes may persist during transmission from human host to human host.
| MATERIALS AND METHODS |
|---|
H. pylori isolation. Two biopsies from each site (antrum, corpus, fundus, and incisura angularis in adults and antrum and corpus in children) were taken from each patient. One biopsy from each site was cultured for H. pylori isolation, and the other was fixed and processed for histological analysis. For culture, biopsy samples were homogenized and inoculated onto Trypticase soy agar plates supplemented with 7.5% sheep blood. Cultures were identified by urease, catalase, and oxidase tests and Gram staining.
From the primary growth obtained for each biopsy site, six or seven single colonies were isolated and propagated; growth from single colonies was swept and suspended in a saline solution for DNA isolation as previously described (5), and the preparations were stored at −20°C until they were tested.
Reference strains. The following strains were used as controls: 60190 (= ATCC 49503) (cag pathogenicity island positive [PAI+], vacA s1a/m1), Tx30a (= ATCC 51932) (cag PAI−, vacA s2/m2), and 84-183 (= ATCC 53726) (cag PAI+, vacA s1b/m1). DNA from control and test strains was included in each PCR assay. For coculture experiments G27 (14) and a PAI mutant derivative constructed by insertion of a Kan-SacB cassette (13) at bp 122 of the cag2 open reading frame (ORF) were used. For array CGH studies of isolates obtained after experimental challenge, Baylor challenge strain BCS 100 (= ATCC BAA-945) was used (24).
PCR genotyping for vacA and cagA. The method used for vacA signal sequence and mid-region PCR typing was a slight modification of the method described by Atherton et al. (6). The PCR conditions were 35 cycles of 94°C for 0.5 min, 56°C for 1 min, and 72°C for 1.5 min and a final extension at 72°C for 5 min. For cagA typing, two sets of primers were used (primers F1 and B1 and primers B7628 and B7629) (23).
RAPD-PCR fingerprinting. RAPD fingerprinting was performed as previously described (51), using the 1281 and 1254 oligonucleotides for priming. The PCR products were electrophoresed in 2.5% agarose gels, and the resulting DNA patterns were analyzed with an automatic image analyzer (Syngene, United Kingdom).
AFLP analysis. AFLP analysis was performed as previously described (21). Briefly, 5 µg of H. pylori DNA was digested with 20 U of HindIII, adapter oligonucleotides ADH1 and ADH2 were ligated to the DNA fragments, and fragments were PCR amplified for 33 cycles using primer H1-1. Products were separated on a 2% agarose gel.
Array CGH. The microarray design and hybridization conditions used have been described previously (44). Each strain was examined by performing a two-color competitive hybridization with a reference sample. For the Mexican pediatric and adult patient isolates and strain BCS 100, the reference preparation used was an equimolar mixture of sequenced strains 26695 and J99, which were used to design the probes on the microarray (44). For the isolates obtained in the human infection experiment, the reference sample was the strain administered to the patients (BCS 100). Each isolate was analyzed on at least two microarrays, which generated four potential data points for each gene. Data points were excluded due to low signals, slide abnormalities, or a regression correlation of pixel intensities in each channel of <0.6. Only the genes for which at least two (and up to ten) measurements were obtained were analyzed. Data were normalized using the default-computed normalization of the Stanford Microarray Database (22), and the mean of the log2(red channel normalized net intensity/green channel net intensity) (log2RAT2N) was computed. Data were also excluded if the standard deviation of the log2RAT2N was greater than 1.0. For analysis of the Mexican isolates and BCS 100 a constant cutoff for absence of a gene was defined as a log2RAT2N value of −1.0 based on test hybridizations (28). Data were simplified into a binary score, analyzed with CLUSTER (http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/) (16), and displayed with TREEVIEW (http://rana.lbl.gov/EisenSoftware.htm) (19). For the human challenge experiments, the GACK program (http://falkow.stanford.edu/whatwedo/software/) (31) was used to determine the divergence of genes from those of the starting strain. This allowed a more stringent search for gene loss, which was needed because the BCS 100 strain used as the reference did not hybridize optimally with the array probes. Only genes present in BCS 100 were considered because a low signal in the reference channel resulted in misleading log ratio data. The complete data sets are available in Tables S1 and S4 in the supplemental material. The raw data are available at http://genome-www5.stanford.edu.
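The calling rules described for the array data can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline that was actually used: the function name `call_gene`, the constant names, and the direction of the absence cutoff (a mean log2 ratio below −1.0 is scored as absent) are assumptions made for the example.

```python
from statistics import mean, stdev

# Thresholds taken from the text; the names and the sign of the cutoff are assumptions.
MIN_MEASUREMENTS = 2      # at least two usable data points per gene
MAX_SD = 1.0              # discard genes whose replicate log ratios are too noisy
ABSENCE_CUTOFF = -1.0     # assumed: mean log2 ratio below this value means "absent"


def call_gene(log_ratios):
    """Return 1 (present), 0 (absent), or None (not reliably measured).

    log_ratios holds the replicate log2(test/reference) values for one gene
    in one isolate; None marks a data point excluded during quality control.
    """
    values = [v for v in log_ratios if v is not None]
    if len(values) < MIN_MEASUREMENTS:
        return None
    if stdev(values) > MAX_SD:
        return None
    return 0 if mean(values) < ABSENCE_CUTOFF else 1


# Hypothetical replicate values for three genes in one isolate.
example = {
    "geneA": [0.1, -0.2, 0.3, None],   # consistent, near zero
    "geneB": [-2.0, -1.5],             # consistently low
    "geneC": [2.0, -2.0],              # replicates disagree
}
binary_calls = {gene: call_gene(vals) for gene, vals in example.items()}
```

The binary score per gene and isolate produced this way is the kind of input that a clustering program would then take.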
PCR confirmation of microarray results. Fifty nanograms of genomic DNA of each isolate was used in a PCR mixture containing 0.2 mM deoxynucleoside triphosphates and 0.2 mM primer DNA. The conditions used for amplification were one cycle of 1 min at 94°C, 30 cycles of 30 s at 94°C, 30 s at 48 to 56°C, and 2 to 4 min at 72°C, and one cycle of 5 min at 72°C. The primer sequences are shown in Table S2 in the supplemental material. In some cases, PCR products were sequenced using Big Dye sequencing reagents (ABI) and primers used for amplification by the FHCRC genomic resource. Sequences were aligned using the Sequencher software (version 4.5) to elucidate the genomic sequences between conserved ORFs. These sequences were compared to the sequences of reference strain ORFs using the NCBI BLAST server (http://www.ncbi.nlm.nih.gov/BLAST/) and were aligned using MAFFT 5.8 (online version; http://align.bmr.kyushu-u.ac.jp/mafft/online/server/) to generate a ClustalW-like output.
Coculture experiments. The human gastric adenocarcinoma cell line AGS was maintained in the presence of 10% CO2 in Dulbecco modified Eagle medium supplemented with 10% fetal bovine serum (FBS). Cells were seeded at a density of 1 × 10^5 cells in 24-well plates. H. pylori strains were grown at 37°C overnight in 90% brucella broth supplemented with 10% FBS, harvested, and resuspended in DB medium (81% Dulbecco modified Eagle medium, 9% brucella broth, 10% FBS) at a density of 2 × 10^6 bacteria/ml, and 1 ml was used to inoculate each well. At each time, supernatant was harvested, centrifuged, and frozen for interleukin-8 (IL-8) analysis (Biotrak enzyme-linked immunosorbent assay system [Amersham Biosciences, United States]). To detect phosphorylated and total CagA, the remaining cells and bacteria in each well were lysed with 100 µl of 2x sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis sample buffer (0.25 M Tris-Cl [pH 6.8], 4% glycerol, 4% SDS, 0.001% bromphenol blue, 2% 2-mercaptoethanol). Samples were resolved by SDS-polyacrylamide gel electrophoresis and transferred to polyvinylidene difluoride membranes. To visualize phosphorylated CagA, membranes were probed with anti-phosphotyrosine PY20 antibody (BD Transduction Laboratories, United States), followed by a 1:10,000 dilution of goat anti-mouse-horseradish peroxidase (Amersham Biosciences). Immune reactive proteins were visualized using the ECL Plus Western blot detection reagent (Amersham Biosciences). To measure CagA, the same membranes were probed with a 1:10,000 dilution of anti-CagA polyclonal antibody pAs (4), followed by a 1:10,000 dilution of goat anti-rabbit-horseradish peroxidase (Amersham Biosciences), and were visualized using ECL Plus.
Statistics. The presence of the clade 1 genotype in the antrum compared with the rest of the stomach was evaluated with Fisher's exact test, and the extent of gene variation between children and adults was evaluated by the Mann-Whitney test using the InStat software (version 3.0; GraphPad Software, Inc., San Diego, CA). A P value less than 0.05 was considered significant.
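For readers who want to see what the Fisher's exact test computes, the following is a minimal sketch in plain Python. It is not the InStat implementation, and the function name `fisher_exact_two_sided` is hypothetical. The 2 x 2 counts in the usage lines are assumed for illustration only; per-site isolate counts are not given in this text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Rows could be biopsy site (antrum, rest of stomach) and columns the
    clade of each isolate (clade 1, clade 2).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Assumed counts, for illustration only: 5 antral isolates, all clade 1;
# 11 isolates from other sites, 4 clade 1 and 7 clade 2.
p_value = fisher_exact_two_sided(5, 0, 4, 7)
print(round(p_value, 4))  # prints 0.0337 for these assumed counts
```

The small tolerance in the comparison guards against floating-point ties when two tables have the same probability.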
Nucleotide sequence accession numbers. The sequences obtained in this study have been deposited in GenBank with accession numbers EF521122 to EF521132.
| RESULTS |
|---|
All H. pylori colonies isolated from the four stomach regions of seven of the eight adult patients and all pediatric patients had the same RAPD and AFLP fingerprints, suggesting that each patient was colonized by a single strain. In contrast, colonies from one adult patient (patient 259) had two different RAPD patterns, which were designated RAPD patterns A and B (Fig. 1). Strains with RAPD pattern A had the same AFLP pattern (AFLP pattern A), while strains with RAPD pattern B had AFLP pattern B (Fig. 1 and Table 1).
cagA. Three of the pediatric isolates were vacA s1m1 cagA+, and one was vacA s2m2 cagA−. In the case of patient 259, who appeared to have two distinct strain populations, all the colonies in clade 1 (RAPD pattern A, AFLP pattern A) contained the vacA s1m1 allele and were cagA+. All the isolates in clade 2 (RAPD pattern B, AFLP pattern B) were vacA s2m2 cagA− (Table 1).
Microarray genotyping identified previously recognized and new variable genes. To further explore the diversity of isolates in single patients, we used array CGH to compare the gene content of each isolate with the gene contents of two unrelated reference strains which have been fully sequenced (strains 26695 and J99). For this analysis we chose three adult patients with different strain populations; one of them appeared to have a homogeneous strain population, one of them appeared to have a mixed infection based on the appearance of two different RAPD profiles, and one of them appeared to have a mixed infection based on the presence of different vacA alleles. We analyzed 36 colonies and included isolates from all four biopsy sites. In addition, we analyzed two antral and two corpus isolates for each of the four pediatric patients (except patient 323, for which only a single corpus isolate was available for analysis) and examined a total of 51 clones. Of the 1,675 genes analyzed, 18 were not reliably measured and 7 were absent in all isolates, while 1,285 (78%) were uniformly present. The remaining 365 genes were differentially present or absent in one or more strains. These variable genes included 309 genes previously documented to be variable based on genomic microarray profiling of clinical isolates obtained from diverse populations worldwide (25, 29, 44). We identified 15 additional genes not previously documented to be variable in strains that were missing in at least two isolates from one to five patients (Table 2). Conversely, 165 genes found to be dispensable in a subset of strains previously were universally present in this collection of isolates (see Table S3 in the supplemental material). The complete data set for the 51 isolates is shown in Table S1 in the supplemental material, and raw microarray hybridization data are available from the Stanford Microarray Database (http://genome-www5.stanford.edu).
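The four gene categories reported here (not reliably measured, absent in all isolates, uniformly present, and variable) follow directly from a table of binary calls. The sketch below shows one way to do that bookkeeping. The function name `classify_genes`, the toy gene names, and the rule that a gene is "unreliable" when no isolate yields a usable call are assumptions for illustration; the exact criterion used in the study is not stated in this text.

```python
def classify_genes(calls):
    """Sort genes into categories from per-isolate binary calls.

    calls maps a gene name to a list with one entry per isolate:
    1 (present), 0 (absent), or None (no usable measurement).
    """
    groups = {
        "unreliable": [],
        "absent_in_all": [],
        "uniformly_present": [],
        "variable": [],
    }
    for gene, values in calls.items():
        measured = [v for v in values if v is not None]
        if not measured:
            groups["unreliable"].append(gene)       # assumed rule: nothing usable
        elif all(v == 0 for v in measured):
            groups["absent_in_all"].append(gene)
        elif all(v == 1 for v in measured):
            groups["uniformly_present"].append(gene)
        else:
            groups["variable"].append(gene)         # present in some, absent in others
    return groups


# Hypothetical calls for four genes across three isolates.
toy_calls = {
    "gene1": [1, 1, 1],
    "gene2": [0, 0, 0],
    "gene3": [1, 0, None],
    "gene4": [None, None, None],
}
toy_groups = classify_genes(toy_calls)
counts = {name: len(genes) for name, genes in toy_groups.items()}
```

Applied to the full data set, the four counts should sum to the number of genes on the array.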
Sequencing across loci that were variable in clades revealed limited genetic exchange among strain populations during mixed infection. Although most genes that were variably present perfectly distinguished the two clades of patient 259, seven genes mapping to six chromosomal loci appeared to be variably present in both the clade 1 and clade 2 strain populations. The variable hybridization pattern might be explained by genetic exchange between the two strain populations, generating a mosaic pattern of gene presence. In order to test this hypothesis, we designed primers for the flanking universally present genes to amplify across these loci in each isolate. We successfully amplified DNA and sequenced the products from four loci representing five of the seven genes that were variable in the clades (Table 4). For one locus we obtained an amplification product that was the same size for all 16 isolates, and for three loci we obtained clade-specific amplification products; one locus was amplified only from clade 2 isolates, while for the other two loci we obtained products that were two distinct sizes and perfectly correlated with the clades to which the isolates belonged. We sequenced the PCR products for three isolates from each clade (a total of six isolates). In all but one case the sequences in a clade were identical. A comparison of the consensus sequences from the clade 1 and clade 2 isolates revealed numerous single-nucleotide polymorphisms (SNPs), as well as large insertions and deletions (Table 4). These findings support our array results showing that strains belonging to the same clade are closely related, while strains belonging to different clades are quite diverse.
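The distinction drawn here, between genes that perfectly separate the two clades and genes that vary inside both clades, can be expressed as a simple rule on the binary calls. The following sketch is one possible formulation; the function name `split_by_clade`, the isolate labels, and the gene names are hypothetical.

```python
def split_by_clade(calls, clade_of):
    """Separate clade-distinguishing genes from genes variable within both clades.

    calls maps gene -> {isolate: 0 or 1}; clade_of maps isolate -> clade label.
    A gene "perfectly distinguishes" two clades when every isolate in one clade
    has it and every isolate in the other clade lacks it.
    """
    distinguishing, variable_in_both = [], []
    for gene, by_isolate in calls.items():
        states = {}
        for isolate, call in by_isolate.items():
            states.setdefault(clade_of[isolate], set()).add(call)
        per_clade = list(states.values())
        if len(per_clade) != 2:
            continue  # the rule is only defined for exactly two clades
        first, second = per_clade
        if len(first) == 1 and len(second) == 1 and first != second:
            distinguishing.append(gene)
        elif len(first) == 2 and len(second) == 2:
            variable_in_both.append(gene)
    return distinguishing, variable_in_both


# Hypothetical isolates: two per clade.
clades = {"i1": "clade1", "i2": "clade1", "i3": "clade2", "i4": "clade2"}
toy = {
    "geneX": {"i1": 1, "i2": 1, "i3": 0, "i4": 0},  # separates the clades
    "geneY": {"i1": 1, "i2": 0, "i3": 1, "i4": 0},  # mosaic in both clades
    "geneZ": {"i1": 1, "i2": 1, "i3": 1, "i4": 0},  # variable in one clade only
}
separating, mosaic = split_by_clade(toy, clades)
```

Genes in the second list are the candidates for which flanking-gene PCR and sequencing would be informative.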
Variation among genes that are variable in strains. We examined the distribution of variable genes in patients to identify the genes that were variable in the strain populations in individual hosts. Of the 365 genes variably present in our patient population, 171 were differentially present in patients but not in the isolates from a single patient (or in a single clade) (see Fig. S1 in the supplemental material). The remaining 196 genes were variably present in one or more patients (or clade). Most of these genes (136 genes) did not have an informative annotation based on the nucleotide sequence. The genes with putative functions included genes involved in DNA uptake, modification, or metabolism (22), genes encoding outer membrane proteins (5), and genes involved in cell envelope biosynthesis or modification (7). Many of the annotated genes belong to multigene families, such as the genes encoding outer membrane proteins. It is possible that these genes can be lost during infection simply due to functional redundancy with other proteins in the cell. Alternatively, there may be selective pressures in the changing host environment that drive the changes observed. The number of genes that varied within a particular host's population ranged from 24 to 67 (Table 3). The average pairwise difference in gene content for the intrahost populations ranged from 12 to 22 genes (Table 3).
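The two per-host summary statistics used here, the number of genes that vary within a host's population and the average pairwise difference in gene content, can be computed from binary profiles as sketched below. The function names and the toy isolate labels are hypothetical, and the choice to compare only genes called in both isolates of a pair is an assumption.

```python
from itertools import combinations


def variable_genes(profiles):
    """Genes whose call differs among the isolates of one host.

    profiles maps isolate name -> {gene: 0 or 1}.
    """
    genes = set().union(*(p.keys() for p in profiles.values()))
    return sorted(
        g for g in genes
        if len({p[g] for p in profiles.values() if g in p}) > 1
    )


def mean_pairwise_difference(profiles):
    """Average number of genes that differ between two isolates of one host."""
    pairs = list(combinations(profiles.values(), 2))
    if not pairs:
        raise ValueError("need at least two isolates")
    # Assumed rule: only genes called in both isolates of a pair are compared.
    diffs = [sum(1 for g in a.keys() & b.keys() if a[g] != b[g]) for a, b in pairs]
    return sum(diffs) / len(diffs)


# Hypothetical host with three isolates and four genes.
host = {
    "a1": {"g1": 1, "g2": 1, "g3": 0, "g4": 1},
    "a2": {"g1": 1, "g2": 0, "g3": 0, "g4": 1},
    "c1": {"g1": 1, "g2": 0, "g3": 1, "g4": 1},
}
n_variable = len(variable_genes(host))          # g2 and g3 vary
avg_diff = mean_pairwise_difference(host)       # (1 + 2 + 1) / 3
```

The first statistic grows with the number of isolates sampled, whereas the pairwise average does not, which is one reason to report both.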
We examined whether any genes distinguished adult and pediatric patients. Three of the genes that distinguished patients exhibited a reciprocal relationship in the adult and pediatric patients. A hypothetical gene present in the J99 sequenced strain (JHP0587) and another hypothetical gene present in all three sequenced strains (HP0688/JHP0628/HPAG1_0671) were present in all of the adult isolates and in none of the pediatric isolates. A gene encoding a putative type III restriction modification system methyl transferase (HP1522/JHP1411/HPAG1_1393) that exhibits phase variation (45) was present only in pediatric isolates and not in the isolates from adults. There was no statistical difference in the extents of genetic variation observed in adults and in children when either the total number of variable loci (P = 0.38) or the pairwise difference between isolates (P = 0.34) was considered.
We also examined whether gene content varied by anatomical site within the stomach. Clustering based on all the variable genes did not indicate that there was a closer relationship among strains from the same biopsy site. Although isolates from the same biopsy site sometimes grouped together, they often did not (Fig. 2). For example, for patient 323 isolates a2 and c5 from the antrum and corpus, respectively, had gene complements that were more similar to each other than to the gene complement of the other isolate from the same anatomical location. The same was true for isolates a1 and c1 from patient 251 and isolates a6 and c10 from patient 612. We examined on a gene-by-gene basis whether gene presence or absence correlated with anatomical site, but we found no genes for which the data approached statistical significance.
cag PAI genes were functional in a clade 1 isolate. The distributions of the two strain populations in the stomach of patient 259 were different. Both clades were found in the fundus, corpus, and incisura angularis of the stomach, but in the antrum only clade 1 bacteria were isolated. A Fisher's exact test comparing the presence of the clade 1 bacteria and colonization of the antrum with the presence of the clade 1 bacteria and colonization of the rest of the stomach yielded a P value of 0.0337, suggesting that the clade 1 bacteria outcompeted clade 2 bacteria in the antrum but not in the rest of the stomach. These two strain populations differ at more than 100 gene loci and, importantly, differ at two loci previously implicated in enhanced virulence, the cag PAI and the vacA cytotoxin loci. Since the patient suffered from duodenal ulcers, we wondered if the clade 1 bacteria had a higher pathogenic potential and if the type IV secretion system encoded by the cag PAI actively induced host cell IL-8 production and translocation of the CagA effector protein.
To determine whether the genes of the cag PAI in clade 1 isolates were functional, we cocultured AGS cells with an isolate belonging to each clade from patient 259. Clade 1 bacteria induced levels of IL-8 secretion that were higher than the levels induced by a control strain having a mutation in the PAI (Fig. 4A). Additionally, this strain translocated CagA protein into host cells, allowing tyrosine phosphorylation (Fig. 4B). The clade 2 strain, in contrast, induced low levels of IL-8 secretion (Fig. 4A) and expressed no detectable CagA protein (Fig. 4B).
| DISCUSSION |
|---|
In this study we investigated the genetic diversity of host bacterial populations in a Mexican population with a high rate of infection (80%). This Mexican population had a higher probability of mixed infection with multiple strains than other populations with lower infection rates, such as populations in the United States and Western Europe. For 11 of the 12 naturally infected patients analyzed, both the RAPD and AFLP approaches indicated that the strains isolated from a host were closely related. Array CGH analysis showed that >95% of the gene loci were highly conserved in strains from the same patient. Although the array CGH results support the hypothesis that the subjects were infected with a population of closely related strains, they also revealed evidence of widespread limited genetic divergence in the strain population in each patient. In single patients, the presence of 24 to 67 genes in single-colony isolates varied. A similar array CGH analysis of isolates from a single United States patient revealed variability in 44 gene loci (28). In a second study the researchers found evidence of genetic differences between pairs of isolates from four of seven (57%) Colombian patients and 5 of 14 (36%) American patients using array CGH (34). In our study we observed a higher frequency of genetic differences between pairs of isolates obtained from individuals from Mexico (100%), but the number of genetic loci affected was similar to the number reported previously (24 to 67 loci versus 44 loci). The higher frequency reported here may have resulted from the fact that more isolates were analyzed for each patient.
We initially considered two extreme models for the generation of a genetically diverse population in an infected individual's stomach. The first model posits a bottleneck during transmission, where a single clone or a few clones establish infection. The initially homogeneous population diversifies over time by mutation, and the genetic changes spread through the population by subsequent recombination between the naturally competent bacteria. Thus, considerable genetic variation could accumulate when the population is sampled in late adulthood, when H. pylori-related symptoms often present and after thousands of doublings have occurred. Alternatively, a second model suggests that children may be infected with a diverse collection of strains from unrelated donors. The diverse strains then homogenize over time via genetic exchange, again due to H. pylori's natural competence and efficient recombination machinery. Evidence which supports this model comes from recent studies with another population in which the endemic infection rate is high (17). Both models predict a difference in the extents of genetic variation in the bacterial populations present in the stomachs of adults and children. In our study, however, we found that the amount of genetic variation was independent of patient age; children as young as 5 years old and adults as old as 81 years old showed comparable variations in gene content.
It is possible that we did not observe a difference in genetic variation between children and adults because we did not sample children that were young enough. It has been proposed that upon infection of a new human host, mutation and/or recombination may be induced, which results in diversification (37). Once the infection is established, however, the potentially dangerous genomic instability might be down-regulated, allowing the population to stably persist. Since our youngest patient was 5 years old and the infection may have been acquired as early as 6 months of age, it is possible that accelerated diversification had already occurred. To address this possibility, we analyzed isolates obtained in a human challenge study performed with the homogeneous strain BCS 100 at 15 and 90 days postinfection (24). The adult volunteers were not related to each other and were not related to the patient from which the donor strain was obtained. Therefore, if host-specific differences select for genetic changes in the bacteria, we expected that such conditions were present during this experiment. In our array CGH and sequencing analyses we found no changes in gene content or sequence divergence up to 3 months after transmission.
Transiently superinfecting strains can donate genetic material that persists in the chronically infecting strain population and contributes to genetic diversity after the initial colonization event. The well-documented incidence of recurrent infection after antibiotic eradication in adults and children suggests that even adults continue to be exposed to H. pylori, at least in regions where the levels of infection are high (26, 47, 56). One patient in our study exhibited heterogeneity with all three tests for macrodiversity. We concluded that there were two distinct populations in patient 259 (designated clade 1 and clade 2) because in contrast to the 24 to 67 genes that varied in most patients, we observed 178 genes that varied, and 113 of these genes perfectly distinguished the two strain groups. Interestingly, within each clade the amount of gene variation was similar to the amount of gene variation observed in individual patients (36 genes). This suggests either that both the strain populations coexisted and diversified in this individual for some time or that this individual was recently infected with a second population of strains.
We were particularly interested in whether there was genetic exchange between the two strain populations since superinfection is a source of genetic variation via recombination of genomic DNA taken up by natural transformation. The presence of so many genes that perfectly distinguished the two clades argues against the hypothesis that there was a high rate of exchange between these two populations in spite of the fact that they were found to coexist at three separate biopsy sites. We sequenced four loci that showed variable hybridization within both clades and that have highly variable sequences in the three reference strains of H. pylori that have been sequenced. Consistent with a clonal genetic structure for each clade population, sequencing of approximately 6,000 bp of genomic DNA from each of six different isolates revealed identical clade-specific sequences with one exception. In the one exception, the polymorphisms observed are best explained by a recombination event with genomic DNA from an isolate belonging to the other clade population. Fortuitously, this recombination event resulted in an altered temperature requirement for PCR amplification of the region, which allowed us to ascertain that the recombination event occurred in only one of seven clade 2 isolates. The low frequency of recombination events observed raises the possibility that there are barriers to genetic exchange. The two strain populations differed in the presence of nine putative restriction-modification genes, which may constitute a very significant barrier to genetic exchange.
Polymorphisms that are best explained by a recombination event have been identified many times by other workers and indeed likely explain the variable loci that we observed in the host strain populations. The source of the recombinant sequences, however, usually cannot be identified. In our study, the most parsimonious explanation is that the recombinant sequence came from the second strain population in the host. Interestingly, one previously documented case of a recombination event between two strain populations in a single infected individual defined at the sequence level involved the cag PAI, a highly variable genomic region (30). The locus where we observed a recombination event also has a highly mosaic structure in the sequenced strains, as well as a number of clinical isolates (12). Thus, targeting such regions in addition to the housekeeping and virulence genes commonly used in MLST studies should help workers characterize the generation of diversity in H. pylori strains.
Most of the patients were colonized with a single strain population, but how did one patient sustain two largely independent strain populations? Among the genes that distinguish clades 1 and 2, the cag PAI genes stand out. Strains positive for the cag PAI are associated with peptic ulcers and gastric adenocarcinoma and induce a variety of cellular phenotypes during coculture with mammalian cells (38, 50). Only the clade 1 strain, which had a functional cag PAI, was found in the antrum and was presumably associated with the development of duodenal ulcers in this patient. Although we cannot exclude the possibility that clade 2 bacteria were simply missed in the antrum due to inadequate sampling, our data suggest that the ratio of the bacteria belonging to the two clades at this biopsy site was skewed compared to the ratio in the rest of the stomach.
The exclusive colonization of the antrum by clade 1 may indicate that this clade is more fit than clade 2 for colonization of this region. Colonization of a single human host by distinct populations of cag PAI+ and cag PAI− strains was documented in a previous study (30). Interestingly, in that study the workers also observed one type of strain exclusively in the antrum, but in that case both cag PAI+ and cag PAI− variants of this strain type were observed. The results of another study using a Mongolian gerbil infection model suggested that the colonization loads of isogenic strains having mutations in either cagA or other cag PAI genes were reduced in the corpus but not in the antrum (43). Together, these data make it very unlikely that the cag PAI is solely responsible for tropism to a particular region of the stomach, although it may affect colonization levels. Although it is not clear whether bacterial genotypes determine infection patterns (36), some strains have a specific and consistent tropism during animal infections (1). We were not able to find any genes that were present or absent exclusively in isolates obtained from a particular region of the stomach, although such negative results might have been expected given our small sample size.
Although the mechanism of H. pylori transmission in the human population has not been firmly established, vertical transmission within families with a bias toward maternally generated infection has been suggested. We suggest that a population of strains can persist during transmission, but this population has restricted diversity, presumably due to geographic and genetic isolation and selection in previous hosts. Further tests of whether diverse populations persist during transmission will require studies of multiple single-colony isolates from infected mothers, fathers, and children and will be technically challenging since endoscopy of healthy individuals would be required to obtain such samples. Such studies, however, might reveal genes required for transmission but not for persistence by identifying genes that are more frequently variable in adults than in children. In our study we identified a few genes that were present only in adults or children, but because of our limited sample size we could not draw any specific conclusions about these genes. The possibility of transmission with a mixed population could have important implications for vaccine design as some protein targets that have been examined (e.g., VacA and CagA) are encoded by genes that are among the genes that are often found to be variable in isolates, both between and within infected individuals. Additionally, molecular diagnostic methods will need to ensure adequate sampling of the bacterial population within individuals to determine the bacterial contribution to disease risk. Finally, it would be very interesting to determine how and why the variable loci (which include determinants of pathogenicity) are maintained in the population when they can clearly be lost or mutated in a subset of isolates. Analysis of infections with divergent strain populations in vitro and in animal models may begin to address these questions.
| ACKNOWLEDGMENTS |
|---|
We thank Marion Dorer for helpful discussions and for critical reading of the manuscript.
| FOOTNOTES |
|---|
Published ahead of print on 2 March 2007.
Supplemental material for this article may be found at http://jb.asm.org/.
| REFERENCES |
|---|