Feng Luo,
Sergio Lizano, and
Debra E. Bessen*
Department of Microbiology and Immunology, New York Medical College, Valhalla, New York 10595
Received 16 August 2006/Accepted 29 September 2006
ABSTRACT
INTRODUCTION
Numerous epidemiological studies report differences in the strains of S. pyogenes recovered from oropharyngeal infection (pharyngitis) versus superficial skin infection (impetigo) (4, 13, 15, 16, 18, 34, 43, 49, 65). Markers used to identify distinct strains were often based on antigenic forms of the M surface protein (M serotypes). The epidemiologic observations led to the recognition of distinct tissue tropisms among members of the species. More recent analysis of the emm genes encoding M proteins led to the definition of three emm genotype patterns, based on the number of emm genes, their ancestral lineages, and their chromosomal arrangements (29, 30).
Multiple isolates of most emm types studied are restricted to a single emm pattern genotype. Among 495 isolates represented by 160 emm types, and including numerous isolates of the more common emm types, only two emm types were observed in association with more than one pattern group (44) despite the fact that a given emm type can often be recovered from strains with distant genetic backgrounds. Thus, the emm pattern can be deduced from the emm type with reasonable accuracy. Population-based surveillance in temperate regions (the United States and Italy), where streptococcal pharyngitis is far more prevalent than impetigo, shows that the vast majority of isolates recovered from cases of throat infection have emm types characteristic of the emm pattern A-C or E genotype, with fewer than 1% of the isolates having pattern D emm types (17, 55). Likewise, nearly all isolates recovered from cases of acute pharyngitis in Mexico (22), Spain (1), and Germany (14) are of emm types characteristic of the emm pattern A-C or E genotype. In a population-based study in tropical Australia, where impetigo is endemic and both pharyngitis and throat colonization are rare, the majority of impetigo isolates had the emm pattern D (46%) or E (40%) genotype (9). In a recent population-based survey in Nepal, 19% of the impetigo isolates were emm pattern A-C (53), higher than the 13% of impetigo isolates from tropical Australia that were pattern A-C (9). A recent study in Ethiopia showed that none of the >60 impetigo isolates had emm types associated with the emm pattern A-C genotype (63).
The findings from population-based collections of isolates are largely consistent with initial observations made from a worldwide collection (12). Taken together, these data provide strong support for the idea that the emm pattern genotype can serve as a reliable genetic marker for tissue site preferences among S. pyogenes causing infection. Strains with the emm pattern A-C genotype tend to show a strong preference for throat infection (throat specialists), emm pattern D strains have a predilection for causing impetigo (skin specialists), and emm pattern E strains readily infect both tissues (generalists).
The observed emm pattern correlations with tissue site preference are restricted to cases of infection and do not necessarily extend to asymptomatic carriage (53). It is also important to emphasize that the association of the emm pattern genotype with throat versus skin infection reflects a general trend, and there are exceptions. For example, in the Ethiopia study, 28% of tonsillitis isolates were of emm types characteristic of emm pattern D (63). Although the relative rates of tonsillitis versus impetigo in Ethiopia are not known, there appear to be ample levels of both forms of disease; a problem inherent in assigning a particular isolate to throat infection is a streptococcal carrier state that coincides with pharyngitis or tonsillitis due to another cause (e.g., virus). Unusual or exceptional clones may also contribute to deviations from the general trend. Nearly all of the pattern A-C impetigo isolates from the Nepal (53) and tropical Australia (9) studies were dominated by just one or two clones, and the pattern A-C clones of both studies were recovered more often from impetigo lesions than from throat carriage.
T antigens provide the basis for another serological typing scheme, in which T and M types share some degree of concordance (33). At least some T antigens are cell wall-anchored surface proteins encoded by genes positioned within the FCT region of the S. pyogenes genome (10, 42, 45, 54). Included among the FCT region surface proteins are several microbial surface components recognizing adhesive matrix molecules (MSCRAMMs) that bind human fibronectin or collagen (24, 38, 39). Recent findings show that at least some T proteins form pilus-like appendages on the bacterial cell surface (42, 45). Thus, FCT region gene products of S. pyogenes are good candidates to have key roles in pathogenesis by mediating bacterial adherence to host tissue. In fact, strong genetic linkage is observed between the emm pattern genotype and rofA versus nra lineage-specific alleles of the transcription-regulatory locus that lies within the FCT region and regulates the expression of at least some of the FCT surface protein genes (10, 11).
The strong associations observed between particular S. pyogenes strains and certain forms of disease raise the possibility that throat-tropic strains share genotypes that distinguish them from skin-tropic strains. Alternatively, there may be a multitude of molecular strategies that are used to achieve the same disease phenotype. In this report, the FCT region gene contents are defined for >100 strains representing a broad segment of the S. pyogenes population. Genetic linkage among various combinations of FCT region genes is assessed, and the relationships between FCT region genes and emm markers for preferred tissue sites for infection are delineated.
MATERIALS AND METHODS
Nucleotide sequence determination and analysis. The nucleotide sequence of the FCT region was determined according to previously described methods (10), using long overlapping PCR fragments and primer walking. The sequencing strategy included strands in both directions and resulted in at least twofold coverage for all regions. Contigs were assembled and sequences were analyzed for open reading frames (ORFs) using Lasergene software (DNASTAR, Inc., Madison, WI). BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html) was used for sequence alignments via the Clustal W algorithm and for calculation of sequence similarity matrices.
PCR amplification-based mapping. PCR amplifications were performed using an annealing temperature of 55°C and extension times of at least 1 min per 0.8 kb of expected product. Two different sets of primers were used for four genes displaying high levels of nucleotide sequence heterogeneity (prtF1, prtF2, cpa, and fctA). For fctA, the second reaction used a forward primer specific for the 3' end of sipA2 and a reverse primer targeting the 5' end of srtC2 (designated a "bridge" reaction). All strains were tested two or more times for each reaction, except that the fctA bridge reaction was used only for strains that were negative for fctA but positive for sipA2 and srtC2. For prtF1, prtF2, cpa, and fctA, a strain was scored positive if at least one of the two reactions (each tested in duplicate) yielded a product.
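The scoring rules in the paragraph above can be restated as a minimal Python sketch. The function names (`score_gene`, `needs_fcta_bridge`, `min_extension_minutes`) are hypothetical and are not part of the original protocol; the sketch only encodes the stated criteria and assumes that each reaction result has already been reduced to a single true/false call.

```python
def score_gene(reaction_a: bool, reaction_b: bool = False) -> bool:
    """Score a heterogeneous gene (prtF1, prtF2, cpa, or fctA) as present
    if at least one of its two primer-pair reactions yielded a product."""
    return bool(reaction_a) or bool(reaction_b)


def needs_fcta_bridge(fcta: bool, sipa2: bool, srtc2: bool) -> bool:
    """The fctA bridge reaction is run only for strains that are
    fctA negative but positive for both flanking genes, sipA2 and srtC2."""
    return (not fcta) and sipa2 and srtc2


def min_extension_minutes(product_kb: float) -> float:
    """Minimum extension time, assuming 1 min per 0.8 kb of expected product."""
    return product_kb / 0.8


# Hypothetical strain: fctA negative in the standard reaction,
# sipA2 and srtC2 positive, so the bridge reaction is warranted.
run_bridge = needs_fcta_bridge(fcta=False, sipa2=True, srtc2=True)
```

Under these assumptions, a 1.2-kb product would call for an extension time of at least 1.5 min.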
The oligonucleotide primer pairs tested for 13 different PCR amplification reactions were as follows: reaction PrtF1-A (amplifying an approximately 0.55-kb fragment of prtF1) forward, 5'-TGCGC GGGTT CTATC GGTTT TGGTC AAGTA-3', and reverse, 5'-AATTA GTTTT YTCAR WGCYT CACGC ATTAA-3'; reaction PrtF1-B (amplifying an approximately 1.2-kb fragment of prtF1) forward, same as for reaction PrtF1-A, and reverse, 5'-CTCCG TCTCA CCAGA CTCAC CCGCT AGAGG TGATT GGTC-3'; reaction Cpa-A (amplifying an approximately 0.75-kb fragment of cpa) forward, 5'-GGATA TGAGA TTGCC GAACC TATTA CTTTT AAAG-3', and reverse, 5'-GGAGC CTGTT TATCT TCCAT TCGAA TAATA TCCAC-3'; reaction Cpa-B (amplifying an approximately 1.3-kb fragment of cpa) forward, 5'-GAAGG TGACT ACTCT AAACT TCTAG AGGGA GCAAC-3', and reverse, 5'-CCAGT TGGTG GGACA AGATC TTTWC GG-3'; reaction PrtF2-A (amplifying an approximately 0.60-kb fragment of prtF2) forward, 5'-GCTGG TGCAA CTATG GAGTT GCGTG ATTCA TCTGG T-3', and reverse, 5'-CCAGT TGCTG GTAAA CTAGT ATTAC TCTTT GGC-3'; reaction PrtF2-B (amplifying an approximately 1.2-kb fragment of prtF2) forward, same as for reaction PrtF1-B, and reverse, 5'-CCCTG GTTAT ACTGG TTGGA GTCCT TCTCT AG-3'; reaction SipA2 (amplifying an approximately 0.46-kb fragment of sipA2) forward, 5'-GCTTT CATAC GGTTA GTACT TAAGA TTTCT ATTAT TGG-3', and reverse, 5'-CCTCT CACTC TTAAT AGAGT TGAGA TTTTC CC-3'; reaction FctA (amplifying an approximately 1.0-kb fragment of fctA) forward, 5'-AAATT ATTAC TTGCT ACTGC AATCT TAGCA ACTGC-3', and reverse, 5'-CTCCA CCAAT AGCCA CAATG CTAAG AACTG CAAAT GGAGC-3'; SipA-SrtC bridge reaction (amplifying an approximately 1.3-kb fragment that encompasses fctA) forward, 5'-GGGAA AATCT CAACT CTATT AAGAG TGAGA GG-3', and reverse, 5'-GGCTT TATTG ATAAC CTGTA CAATT GTCAT C-3'; reaction SrtC2 (amplifying an approximately 0.72-kb fragment of srtC2) forward, 5'-GATGA CAATT GTACA GGTTA TCAAT AAAGC C-3', and reverse, 5'-CTTGA ATAGT ACCGA CAACG ATAAC ACGAT TGTCA G-3'; reaction FctB (amplifying an approximately 0.53-kb fragment of fctB) forward, 5'-ATGTT ATTTT CTGTC GTAAT GATAT TAACC-3', and reverse, 5'-CTAGT AACCC CAGTA ATACG ATACT TAAGA TACCC-3'; reaction SrtB (amplifying an approximately 0.74-kb fragment of srtB) forward, 5'-CTAAA ATAAT AGCTA TAACC ACCCC GAAAG CAGCA C-3', and reverse, 5'-CTAAA ATAAT AGCTA TAACC ACCCC GAAAG CAGCA C-3'; and reaction Sof (amplifying an approximately 0.65-kb fragment of sof) forward, 5'-GTATA AACTT AGAAA GTTAT CTGTA GG-3', and reverse, 5'-GGCCA TAACA TCGGC ACCTT CGTCA ATT-3'.
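Several of the primers listed above contain degenerate positions written with IUPAC ambiguity codes (Y, R, and W), which allow one primer to pair with divergent alleles. The following Python sketch shows how such a primer expands into its set of concrete sequences. The helper name `expand_primer` is hypothetical, and only the codes that occur in the primer list are mapped here.

```python
from itertools import product
from typing import List

# Standard IUPAC meanings for the codes used in the primer list:
# R = A or G, Y = C or T, W = A or T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT",
}


def expand_primer(primer: str) -> List[str]:
    """Return every concrete sequence encoded by a degenerate primer.
    Spaces used for visual grouping are ignored."""
    bases = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in bases]
    return ["".join(combo) for combo in product(*choices)]


# Reverse primer of reaction PrtF1-A, as listed in the text.
prtf1_a_reverse = "AATTA GTTTT YTCAR WGCYT CACGC ATTAA"
variants = expand_primer(prtf1_a_reverse)
```

This primer has four twofold-degenerate positions, so it corresponds to 16 distinct 30-mers.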
To address the potential problem of either false positives or false negatives, several quality controls were implemented. Controls for the template DNA, deoxynucleotide triphosphates, and Taq polymerase always included PCR amplification of a housekeeping gene (usually xpt or mutS) (21) performed in parallel. In addition, for a given primer pair set, all PCR amplifications were performed in parallel with a positive control template DNA and often with a known negative control DNA. All reactions were performed at least twice on separate days. To score PCRs as positive or negative, the PCR products were confirmed for their predicted sizes by agarose gel electrophoresis using molecular size markers and scored only if they were clearly negative or strongly positive based on the intensity of ethidium bromide staining. Discordant findings (e.g., two trials giving different results or bands that were present but weak) led to multiple rechecks of that strain using template DNA that was freshly prepared by boiling colony picks grown from the frozen stock culture; this helped to eliminate confounding results due to cross-contamination of the DNA template with DNAs from other strains. Together, the controls helped to ensure that both false positives (due to DNA cross-contamination) and false negatives (due to amplification of the wrong gene, partially evidenced by gel migration; bad reagents, template, or primers; or failure of the thermal-cycler machine) were minimized.
Additional controls to help rule out potential false positives among the PrtF1-A, Cpa-A, FctA, and SipA-SrtC bridge reaction products, each of which exhibited high levels of sequence heterogeneity among alleles, included nucleotide sequence determination and multiple-sequence alignment of a large selection of products; findings of well-aligned sequences signified that the PCR products were part of the same gene family (44, 51) (data not shown).
Statistical analyses. Linkage disequilibrium between pairs of genes was calculated using 2-by-2 tests for independence; data are reported for χ² analysis (two tailed) with Yates' correction (DnaSP 4.10). Unrooted phylogenetic trees were based on matrices of character states at several loci, defined as the presence/absence of a locus or a specified gene lineage, and were constructed by maximum parsimony (PAUP 4.0; Sinauer Associates). Diversity (D; Simpson's diversity index) among the S. pyogenes isolates was quantified according to methods that incorporate the number of distinct genotypes and their frequencies of occurrence (31), where a D value of 1.0 indicated that the genotype was able to discriminate between all isolates and a D value of 0.0 indicated that all isolates had identical genotypes. Confidence intervals (95%) were determined by the method of Grundmann et al. (26).
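As an illustration of the diversity measure, the Python sketch below computes Simpson's index in its usual finite-sample form, D = 1 - Σ n_j(n_j - 1) / [N(N - 1)], together with an approximate 95% confidence interval. The interval uses the large-sample variance approximation generally attributed to Grundmann et al., σ² = (4/N)[Σ π_j³ - (Σ π_j²)²] with limits D ± 2σ. The source cites that method without stating the formula, so treat the formula, the function names, and the example genotype labels as assumptions.

```python
from collections import Counter
from math import sqrt
from typing import Iterable, Tuple


def simpson_diversity(genotypes: Iterable[str]) -> float:
    """Simpson's index of diversity: 1.0 when every isolate has a unique
    genotype, 0.0 when all isolates share one genotype."""
    counts = Counter(genotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def approx_ci95(genotypes: Iterable[str]) -> Tuple[float, float]:
    """Approximate 95% confidence interval (assumed formula, see lead-in)."""
    items = list(genotypes)
    counts = Counter(items)
    n = len(items)
    freqs = [c / n for c in counts.values()]
    variance = (4.0 / n) * (sum(p ** 3 for p in freqs)
                            - sum(p ** 2 for p in freqs) ** 2)
    half_width = 2.0 * sqrt(max(variance, 0.0))
    d = simpson_diversity(items)
    return d - half_width, d + half_width


# Hypothetical genotype labels for six isolates (not data from the study).
example = ["FCT-3", "FCT-3", "FCT-3", "FCT-4", "FCT-4", "FCT-5"]
d_value = simpson_diversity(example)
low, high = approx_ci95(example)
```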
Nucleotide sequence accession numbers. The complete nucleotide sequences of the FCT regions of strains ALAB49, 29487, and D633 have been assigned GenBank accession numbers DQ984656, EF025060, and EF025061.
RESULTS AND DISCUSSION
In an earlier report, the FCT regions of five S. pyogenes strains were subjected to a detailed sequence analysis (10). An update to that analysis is presented here and includes all 13 emm type-defined strains whose complete FCT region sequences are known. Levels of sequence identity established by BLASTN (3), combined with relative positions within the FCT region, were used to help distinguish between alleles belonging to the same locus and alleles derived from distinct genes. Based on gene content and order, each FCT region was assigned to one of six major forms (Fig. 1). Genotypes FCT-1 through FCT-4 were previously described (10, 46, 60). The recently reported genome sequences of an emm4 and an emm2 strain (8) indicate that each of the two strains represents a new FCT region genotype arrangement, designated FCT-5 and FCT-6, respectively. Only a single strain was identified for each of the FCT-1, -2, -5, and -6 genotypes. The three emm pattern D impetigo isolates that underwent nucleotide sequence determination for this study were found to share the previously recognized FCT-3 form with several emm pattern A-C and E strains. In all, 9 of the 13 strains displayed the FCT-3 or FCT-4 form.
99 and 60%, respectively, for products of the nra and rofA lineage alleles (Table 1) (11). There are also several highly conserved gene products, exhibiting more than 95% amino acid sequence identity, that are shared among the nine strains of the FCT-3 or FCT-4 genotype. These include SipA2 (a putative signal peptidase), SrtC2 (a specialized sortase) (5), and MsmR (a transcriptional regulator) (46). Several distinct (putative) sortase genes were identified, and all six FCT region forms had at least one type of sortase gene (data not shown); however, aside from srtC2, only srtB is present in more than one FCT region form (FCT-1, -2, and -4) (Fig. 1). In contrast to sortase, putative signal peptidase genes are restricted to three FCT regions.
35% overall amino acid sequence identity (data not shown), although within a lineage, the gene products are highly similar in sequence (Table 1). The fbaB and pfbpI lineage alleles correspond to the FCT-3 and FCT-4 genotypes, respectively, with the exception of the emm49 strain, which has an FCT-3 genotype and a pfbpI lineage allele (Table 1). The latter finding is indicative of a history of genetic recombination within the FCT region following interstrain horizontal gene transfer events. The cell wall-anchored surface protein families of Cpa and FctA each display extensive amino acid heterogeneity. The magnitude of sequence similarity among Cpa proteins tends to be discordant with that found among FctA proteins (Tables 2 and 3). For example, the FctA proteins of strains D633 and MGAS6180 are 81% identical, whereas their Cpa proteins display only 37% identity; even their prtF2 genes belong to different lineages. Conversely, strains D633 and MGAS8232 have highly similar Cpa and PrtF2 proteins (98 and 94% identity, respectively), but their FctA sequences diverge by more than 35%. The finding on Cpa-FctA discordance may have biological implications for the direct molecular interactions that are observed between Cpa and FctA (42, 45). Furthermore, the Cpa-FctA discordance provides additional evidence for genetic recombination within the FCT region leading to the generation of mosaic-like structures.
T type and FCT region forms. Historically, T typing has been widely used as a serological typing scheme (32). It was recently shown that at least some T antigens are cell wall-anchored surface proteins encoded by FCT region genes (10, 42, 45, 54). The extensive amino acid heterogeneity observed for Cpa and FctA (Tables 2 and 3) is suggestive of strong diversifying selection acting on cpa and fctA, such as that imposed by the host immune response. The T serotypes were determined for isolates having FCT-1 through FCT-4 genotypes via trypsin treatment, followed by agglutination in the presence of T-serotype-specific antiserum. Multiple T types were evident among strains assigned to either the FCT-3 or FCT-4 genotype (Table 1). One T type (T3/13/B) was present among multiple strains (all FCT-3), in which Cpa and FctA proteins diverged in amino acid sequence by more than 20 and 35%, respectively. The magnitude of amino acid homology that defines each T type and the extent to which trypsin removes antigenic epitopes remain to be established through more extensive strain sampling.
70% of the 160 known emm types (http://www.cdc.gov/ncidod/biotech/strep/strepindex.htm), the near-complete range of FCT region genotypes within the species is likely to be represented. MLST based on seven housekeeping loci yielded 113 distinct sequence types (STs) and 113 unique combinations of emm type and ST. Each of the 113 isolates also differed from all others in the set at two or more of the seven housekeeping alleles, and 107 of the isolates differed from all others at three or more housekeeping alleles. Thus, the 113 isolates under evaluation are genetically distant from one another, and each can be regarded as a distinct strain. The FCT-3 and FCT-4 region genotypes were chosen as a starting point for comparison, since they accounted for most of the strains whose complete FCT region sequences are known (Fig. 1). Nine of the 10 FCT region genes were targeted for PCR amplification in order to ascertain their presence or absence among the set of 113 diverse strains. Since alleles of the prtF1, prtF2, cpa, and fctA genes display extensive sequence heterogeneity (10, 44, 64) (Tables 1, 2, and 3), two primer pairs were used for PCR-based detection of each gene (see Materials and Methods). All oligonucleotide primer pairs displaying BLASTN hits with an FCT region whose complete nucleotide sequence is known also yielded a PCR amplification product of the expected size using template DNA derived from the sequenced isolate or a related strain sharing the same emm type and ST (Table 1; see Table S1 in the supplemental material). Similarly, FCT regions in which one or both primers of a pair did not align via BLASTN also failed to yield a PCR product. The observed correspondence between in silico analysis and experimental findings provided further validation of the approach.
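The criterion that each isolate differs from all others at two or more of the seven housekeeping alleles can be checked with a simple pairwise comparison of MLST allelic profiles. The Python sketch below is one way to do this. The function names and the allele numbers are hypothetical and are not taken from the study.

```python
from itertools import combinations
from typing import Dict, Sequence


def allele_differences(st_a: Sequence[int], st_b: Sequence[int]) -> int:
    """Number of housekeeping loci at which two allelic profiles differ."""
    if len(st_a) != len(st_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(st_a, st_b))


def min_differences(profiles: Dict[str, Sequence[int]]) -> Dict[str, int]:
    """For each isolate, the smallest number of allele differences
    to any other isolate in the set."""
    closest: Dict[str, int] = {}
    for (name_a, prof_a), (name_b, prof_b) in combinations(profiles.items(), 2):
        d = allele_differences(prof_a, prof_b)
        for name in (name_a, name_b):
            if name not in closest or d < closest[name]:
                closest[name] = d
    return closest


# Hypothetical seven-locus allelic profiles.
profiles = {
    "isolate_A": (1, 2, 3, 4, 5, 6, 7),
    "isolate_B": (1, 2, 3, 4, 5, 8, 9),
    "isolate_C": (9, 8, 7, 6, 5, 4, 3),
}
nearest = min_differences(profiles)
all_distinct_strains = all(d >= 2 for d in nearest.values())
```

In this toy set, every isolate differs from its nearest neighbor at two or more loci, so each would be regarded as a distinct strain under the stated criterion.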
Detailed analytic PCR findings on 113 genetically distinct strains are presented in Table S1 in the supplemental material. Although the relative order of the FCT region genes was not established by this approach, correspondence to the FCT region genotypes outlined in Fig. 1 was assumed based on the presence or absence of individual genes. This assumption seems reasonably well justified, because there were no examples of gene order reversal within the FCT regions whose complete nucleotide sequences are known (Fig. 1).
The most prevalent FCT region genotypes uncovered by the PCR-based assessment were FCT-3 and FCT-4, representing 32 and 27% of the 113 strains, respectively (Table 4). The FCT-5 form ranked next in prevalence (17%), although correspondence to FCT-5 was characterized by an absence of many of the genes found in FCT-3 and FCT-4, and thus, probes designed to specifically target FCT-5 genes might reveal additional FCT region genotypes. A derivative of the FCT-4 genotype, in which only the cpa gene went undetected, was found among 12% of the strains; this combination of genes is designated the FCT-7 genotype (Table 4). Another derivative of the FCT-4 genotype, in which nra replaces rofA, is designated FCT-8, although this combination of genes was limited to just three strains. The FCT-1 genotype corresponded to eight strains. Both the FCT-6 and FCT-2 forms were rare among the sample set, found in association with only one and two strains, respectively.
In general terms, the probability of coinheritance increases as the length of the intergenic region available for crossover decreases. Although recombination can disrupt physical connections, strong linkage among genes within the FCT region may be the consequence of their very close proximity. All pairwise combinations of FCT region genes were assessed for linkage using two-by-two tests for independence, in which the presence or absence of one gene was compared to the presence or absence of the other gene. Pairwise comparisons showing highly significant, nonrandom associations between genes were taken as evidence for strong linkage disequilibrium. As expected for closely positioned loci, nearly all pairwise comparisons of FCT region genes (42 of 45; 93%) displayed evidence for strong linkage (P < 0.05) (Fig. 2). Applying the Bonferroni correction, which factors in the total number of pairwise comparisons, reduced the number of gene combinations displaying strong linkage to 32 (71% of the total).
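The pairwise test and the multiple-comparison adjustment described above can be sketched in Python as follows. This is a minimal illustration, not the DnaSP implementation used in the study: it computes the Yates-corrected chi-square statistic for a 2-by-2 presence/absence table, obtains the two-tailed P value for 1 degree of freedom from the complementary error function, and compares it with a Bonferroni threshold of 0.05 divided by the 45 comparisons. The function names and the example counts are hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def yates_chi2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Yates-corrected chi-square test for a 2-by-2 table.

    a = strains with both genes, b = gene 1 only,
    c = gene 2 only,            d = neither gene.
    Returns (chi-square statistic, P value for 1 degree of freedom).
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column total is zero")
    # The continuity correction must not push the numerator below zero.
    numerator = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * numerator ** 2 / margins
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2.0))


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical counts for one gene pair (not data from the study).
chi2, p_value = yates_chi2(10, 0, 0, 10)
threshold = bonferroni_threshold(0.05, 45)
still_significant = p_value < threshold
```

With 45 comparisons the corrected threshold is roughly 0.0011, which explains why some gene pairs that pass at P < 0.05 no longer count as strongly linked after correction.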
Other examples of strong positive linkage among the 113 strains involve pairings of sipA2, srtC2, fctB, and prtF2 (Fig. 2). The fctA "bridge" PCR amplification (see Table S1 in the supplemental material) is designed to capture the most highly divergent fctA alleles; when the fctA bridge-positive strains were combined with the fctA-positive strains, additional gene pairs exhibiting strong positive linkage were identified. The sipA2, fctA, srtC2, fctB, and prtF2 loci are contiguous in the FCT-3 and FCT-4 genotypes (Fig. 1). In an emm49 strain (FCT-3), cpa plus four downstream loci (sipA2, fctA, srtC2, and fctB) are transcribed as a single polycistronic transcript (48). A general property of operons is that genes are transcribed in a coordinated manner in order to fulfill a biological need, and thus, the operon itself can become the unit of selection (41). The operon structure could explain the strong positive linkage observed for the pairwise comparisons made with sipA2, fctA, srtC2, and fctB (Fig. 2). It may be that cpa is not included among this group because it has divergent forms that were not detected by the PCR-based method employed. In fact, exclusion of the FCT-7 genotype strains lacking cpa (Table 4) from the above-mentioned calculations yielded strong positive linkage between cpa and the downstream genes of the operon (data not shown).
The tight association of prtF2 with genes of the adjacent operon could be the result of either physical proximity on the chromosome (Fig. 1) or positive selection that arises from interactions among the gene products. In support of the latter possibility, Cpa, FctA, and PrtF2 are each required for the formation of the T-antigen complex (42). prtF2 and fctB comprise the only gene pair that exhibits complete positive linkage, where none of the 113 strains has one locus in the absence of the other (Fig. 2). However, the pfbpI lineage prtF2 allele present in the FCT-3 region of an emm49 strain (Table 1), in place of the fbaB lineage allele that is more typical of sequenced FCT-3 forms, provides at least one example showing that recombinational replacements involving prtF2 alleles have occurred.
Of the many possible pairwise comparisons between FCT region genes (Fig. 2), 12 involve gene pairs for which nearly all 113 strains harbor at least one of the two genes. Of special interest are the four pairs involving genes encoding cell wall-anchored surface proteins. Nearly every strain has either prtF1 or at least one of four other surface proteins (cpa, fctA, fctB, or prtF2) (Table 4 and data not shown). Only three strains are devoid of all five of these surface protein genes, although they have other genes corresponding to pilus-like structures. Specifically, FCT-2 has orthologs of the cpa through fctB series of genes, whereas FCT-6 has strong partial homology to Streptococcus agalactiae streptococcal pilus genes (19, 40, 52). Although the cell wall-anchored surface protein genes are considered to be accessory loci due to their differential presence among strains, the data presented here suggest that possession of either prtF1 or one of the other four surface protein genes is critical to the long-term survival of most S. pyogenes organisms in their natural environment.
Diversity in FCT region forms among emm pattern-defined strains. The distribution of the eight FCT region genotypes within each emm pattern-defined subset of strains is summarized in Table 5. Statistical methods were used to compare the extents of diversity in FCT region genotypes among the emm pattern-defined subsets of isolates. The highest D value was observed for the emm pattern A-C strains, indicating that as a group, the so-called throat specialists display the most diversity in FCT region genotypes. The 20 emm pattern A-C strains are represented by six of the FCT region genotypes, with no single FCT region form corresponding to more than 25% of the strains. The lowest D value was found for the 38 emm pattern D strains, indicating that the so-called skin specialists are the most homogenous in their FCT region gene contents. The emm pattern D strains are distributed among five FCT region genotypes; however, the majority (79%) belong to the FCT-3 genotype. The 55 emm pattern E strains have an intermediate D value and correspond to seven of the FCT region genotypes, but most (75%) are restricted to two FCT region forms (FCT-4 and FCT-5). The 95% confidence intervals for the three emm pattern-defined groups indicate that each is significantly different from the other two groups in terms of diversity. If gene products of the entire FCT region are critical for infection at specific tissue sites, then the diversity measures support the notion that emm pattern A-C strains draw upon a variety of strategies to cause infection in the throat, whereas emm pattern D strains may be more dependent on a singular mechanism.
Five of the FCT region genes assessed by PCR encode cell wall-anchored surface proteins. Cpa, FctA, and FctB are associated with pilus-like appendages (42, 45). Cpa binds human type I collagen (38), and PrtF1 and PrtF2 each bind human fibronectin (28, 39). The strongest measure of positive linkage between the emm pattern-defined skin versus throat specialist strains and FCT region surface protein genes lies with cpa, which is present in nearly all (92%) emm pattern D strains and absent from 75% of emm pattern A-C strains (Table 6). However, the presence of cpa in several pattern A-C strains, particularly those of the highly prevalent classical throat types emm3, emm5, and emm18 (55), makes it unlikely that Cpa by itself directs S. pyogenes to the skin. Cpa may be necessary for most skin infections, but it does not appear to be sufficient. Otherwise, emm3, emm5, and emm18 isolates should be recovered from impetigo lesions in much greater numbers. It should be noted that cpa expression may be aberrant in the emm18 strain, which has a defect in its nra regulatory gene (56).
Three other surface protein genes (fctA, fctB, and prtF2) are also found in high numbers among emm pattern D strains, 92% of which have all three genes plus cpa. However, fctA, fctB, and prtF2 are also present among 50% of the emm pattern A-C strains (Table 6). Although it is possible that the fctA, fctB, and/or prtF2 gene is necessary for skin infection, these genes do not appear to be sufficient to cause disease at this site, because each is present in many of the throat specialist strains as well. As a group, emm pattern E strains are recovered in high numbers from both throat and skin sites of infection, so it might be expected that these organisms have virulence determinants directed toward both tissues. A high proportion (93%) of emm pattern E strains harbor prtF1 (Table 6). However, only 55% of the pattern E strains have cpa. Therefore, if cpa is a determinant for infection of the skin, it would appear that many pattern E strains may utilize an alternative molecular strategy.
The findings of this study provide supporting evidence that Cpa may, in large part, be necessary for S. pyogenes infection of the skin, whereas PrtF1 may be a key virulence determinant in the throat. However, there are several emm pattern D strains harboring prtF1 and several emm pattern A-C strains having cpa. Thus, virulence determinants in addition to PrtF1 may be required for throat infection. Likewise, Cpa may not be sufficient for skin infection. For those few pattern A-C strains lacking prtF1 and those few pattern D strains lacking cpa, there may exist alternative strategies for causing infection of the designated tissue.
Molecular epidemiologic findings often provide a sound basis for formulating testable hypotheses. In experimental studies, the role of cpa in the virulence of an emm pattern D strain was investigated using a highly sensitive, humanized mouse model of impetigo. The experiments show that cpa is essential for skin infection under certain growth conditions (42). Prior reports had demonstrated that three additional factors, whose genes lie outside the FCT region, were also required for maximal virulence at the skin (57, 59). Importantly, two of the factors (secreted cysteine protease and plasminogen-binding M protein) were largely lacking among the emm pattern A-C strains. Thus, the population findings of this report, which support a role for Cpa in skin infection as necessary but not sufficient, confirm the combined experimental findings. A critical role for PrtF1 in throat infection remains to be established via a refined model of streptococcal pharyngitis.
Several investigators have screened numerous isolates, defined according to emm or M protein type, for a variety of FCT region genes (5, 25, 38, 47, 62), primarily prtF1, prtF2, and cpa. However, direct comparisons to data presented in this report are limited, because technical approaches with various degrees of sensitivity and specificity were employed and many emm types are found in association with distant genetic backgrounds, as defined by the ST (reference 21 and unpublished data). The degree to which an emm type is associated with different FCT region forms is not known, and it is probably not correct to simply assume that a particular emm type has a strict correlation with the FCT region form assigned to the 113 isolates of this study. In addition, a given emm type is often found in association with numerous T serotypes (33). However, much remains to be uncovered about the molecular relationship between the T serotype and the FCT region genotype. The recent finding that at least three FCT region gene products are essential for measuring the T type of one strain (42) may signify that extensive nucleotide sequence determination is required to relate a T type to an FCT region genotype. Conceivably, the molecular determinants of the T serotype lie within the highly heterogeneous portions of just a few genes, such as cpa or fctA. The modes of genetic diversification that give rise to different T types are unknown and may involve mutation, intragenic recombination, and/or en bloc transfer of the entire FCT region. The many non-T-typeable strains that exist (33) may also pose difficulties in relating the T type to the FCT region genotype. The fact that FCT region genes are either positively or negatively regulated under the same growth conditions according to strain (35, 37, 48) may undermine the phenotypic approach used for T-serotype determination. For the molecular epidemiologic analysis of S. pyogenes, the sequence-based typing methods of emm typing and MLST should be regarded as the gold standards.
Despite the potential for variance in the relationship between individual emm types and FCT region genotypes, the data analysis of this report groups emm types together in accordance with their emm patterns, and comparisons are made between the emm pattern group and either the FCT region form or specific FCT region genes. The emm pattern group sizes range from 20 (A-C) to 55 (E) isolates and consist of genetically distant strains, so in many ways, the sample set is quite robust. The degree to which more extensive sampling will alter the general findings of this report remains to be determined.
Pathways of evolution. Structural analysis of the FCT regions of the S. pyogenes population supports a key role in evolution for recombination leading to gene replacements. The gain or loss of genes can be assessed by maximum parsimony, a character state-based method for inferring phylogenetic trees. The characters (i.e., genes) are assigned a state (i.e., presence or absence, or lineage), and the ordered series of character states yields a taxon. It is assumed that the relative chromosomal orders of the genes under study are the same in all strains, which seems justified, since there are no examples of gene order reversal (Fig. 1).
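The character-state coding described above can be illustrated with a minimal sketch. It assumes a simple presence/absence coding; the strain names and the 0/1 calls are invented for illustration, only four of the FCT region genes named in the text are used, and the function name `collapse_to_taxa` is hypothetical. It is not the authors' procedure.

```python
from collections import defaultdict

# Hypothetical presence (1) / absence (0) calls for four FCT region genes.
# Strain names and values are invented for illustration only.
GENES = ("prtF1", "cpa", "fctA", "prtF2")
STRAINS = {
    "strain_1": (1, 1, 1, 0),
    "strain_2": (1, 1, 1, 0),
    "strain_3": (0, 1, 1, 1),
    "strain_4": (0, 0, 1, 1),
    "strain_5": (0, 1, 1, 1),
}


def collapse_to_taxa(strains):
    """Group strains that share an identical ordered series of character states."""
    taxa = defaultdict(list)
    for name, states in strains.items():
        taxa[tuple(states)].append(name)
    return dict(taxa)


taxa = collapse_to_taxa(STRAINS)
for states, members in taxa.items():
    # Each distinct state vector is one taxon; its members are the strains sharing it.
    print(dict(zip(GENES, states)), members)
```

In this toy input, five strains collapse to three taxa, in the same way that the strains of a real data set collapse to a smaller number of distinct character-state profiles.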
The nine FCT region genes targeted for PCR amplification yield 13 distinct taxa among the 113 strains (based on data in Table S1 in the supplemental material). A phylogenetic tree was constructed by maximum parsimony by including one representative of each taxon in the data matrix (Fig. 3). Although some of the FCT region forms are represented by multiple taxa (FCT-3, -4, and -7), they lie along adjacent branches. Each strain is denoted by a symbol at one of the nodes represented by the 13 taxa, in accordance with the emm pattern genotype. Most emm pattern D strains cluster at one end of the tree, whereas emm pattern A-C strains are scattered at many nodes throughout, without strong clustering in any one region. Interspersed among the branches leading to pattern A-C strains are the emm pattern E strains, although the latter group forms two dense clusters that are separated by four genetic changes. The data provide evidence that the majority of FCT regions present within emm pattern E strains evolved along two discrete pathways.
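As a rough illustration of how maximum parsimony compares candidate trees, the sketch below counts the minimum number of character-state changes on a fixed rooted binary tree using Fitch's algorithm. The taxa, their 0/1 states, the two tree topologies, and the function names are assumptions made for the example; the software and settings used to build Fig. 3 are not stated in this section, and a full analysis would also search over tree topologies.

```python
def fitch_changes(tree, states):
    """Minimum number of changes for one character on a rooted binary tree.

    tree: nested 2-tuples whose leaves are taxon names (strings).
    states: dict mapping each taxon name to its character state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = node
        a, b = visit(left), visit(right)
        common = a & b
        if common:                          # children agree: no change needed
            return common
        changes += 1                        # children disagree: one change
        return a | b

    visit(tree)
    return changes


def parsimony_score(tree, matrix):
    """Sum of Fitch change counts over all characters (columns) in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_changes(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical taxa with presence/absence states for four genes.
MATRIX = {
    "A": (1, 1, 1, 0),
    "B": (0, 1, 1, 1),
    "C": (0, 0, 1, 1),
    "D": (1, 1, 0, 0),
}
tree_1 = (("A", "D"), ("B", "C"))
tree_2 = (("A", "B"), ("C", "D"))

score_1 = parsimony_score(tree_1, MATRIX)
score_2 = parsimony_score(tree_2, MATRIX)
print(score_1, score_2)  # the tree with the lower score is preferred
```

For this toy matrix the first topology requires fewer gene gains or losses than the second, so it would be the more parsimonious of the two.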
|
10 kb upstream of emm in some strains and encodes a multifunctional protein that can bind human fibronectin (36). Mga can regulate the transcription of emm, sof, and rofA/nra, whereas RofA/Nra can affect the transcription of mga, prtF1, the series of cpa to fctB genes, and prtF2 (2, 7, 24, 35, 37, 48, 61). Furthermore, both the mga and rofA/nra loci are subject to autoregulation. Some of the gene-to-gene interactions may be strain dependent. The emm and FCT regions are separated by 0.3 Mb and do not lie in close proximity on the chromosome.
|
A striking feature of the mixed FCT and emm region tree (Fig. 4) is the placement of most emm pattern E strains (generalists) between most throat specialist (pattern A-C) and skin specialist (pattern D) strains, so that the specialists cluster at opposite ends. This model is consistent with a fundamental concept of ecology and evolution, which states that specialists tend to emerge from generalists (20). Even if one of the specialist genotypes was the S. pyogenes progenitor, organisms would need to have evolved through the generalist stage in order to spawn new populations of specialists.
Concluding remarks. The FCT region of S. pyogenes appears to be important in pathogenesis, because several of its products give rise to MSCRAMMs and pilus-like appendages that may facilitate interactions between the bacterium and host. The FCT region also displays a wide variety of structural forms and contains signatures of past recombination events, indicative of selective pressures that result in adaptation of the organism to a variety of environmental conditions.
| ACKNOWLEDGMENTS |
|---|
This work was supported by funding from the National Institutes of Health (AI053826, AI061454, and GM060793) and the American Heart Association (to D.E.B.).
| FOOTNOTES |
|---|
Published ahead of print on 6 October 2006.
Present address: Department of Medicine Unit I and Infectious Diseases, Christian Medical College, Vellore, India.
Supplemental material for this article may be found at http://jb.asm.org/.
| REFERENCES |
|---|
|
|
|---|