Journal of Bacteriology, July 2004, p. 4285-4294, Vol. 186, No. 13
0021-9193/04/$08.00+0     DOI: 10.1128/JB.186.13.4285-4294.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.

Multilocus Sequence Typing of Streptococcus pyogenes Representing Most Known emm Types and Distinctions among Subpopulation Genetic Structures

Karen F. McGregor,1 Brian G. Spratt,1 Awdhesh Kalia,2,3 Alicia Bennett,3 Nicole Bilek,1 Bernard Beall,4 and Debra E. Bessen3,5*

Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom,1 Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, Missouri,2 Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut,3 Respiratory Diseases Branch, Centers for Disease Control and Prevention, Atlanta, Georgia,4 Department of Microbiology and Immunology, New York Medical College, Valhalla, New York5

Received 17 December 2003/ Accepted 1 April 2004


ABSTRACT
 
A long-term goal is to characterize the full range of genetic diversity within Streptococcus pyogenes as it exists in the world today. Since the emm locus is subject to strong diversifying selection, emm type was used as a guide for identifying a genetically diverse set of strains. This report contains a description of multilocus sequence typing based on seven housekeeping loci for 495 isolates representing 158 emm types, yielding 238 unique combinations of sequence type and emm type. A genotypic marker for tissue site preference (emm pattern) revealed that only 17% of the emm types displayed the marker representing strong preference for infection at the throat and that 39% of emm types had the marker for skin tropism, whereas 41% of emm types harbored the marker for no obvious tissue site preference. As a group, the emm types bearing the emm pattern marker indicative of no obvious tissue site preference were far less likely to have two distinct emm types associated with the same sequence type than either of the two subpopulations having markers for strong tissue tropisms (P < 0.002). In addition, all genetic diversification events clearly ascribed to a recombinational mechanism involved strains of only two of the emm pattern-defined subpopulations, those representing skin specialists and generalists. The findings suggest that the population genetic structure differs for the tissue-defined subpopulations of S. pyogenes. The observed differences may partly reflect differential host immune selection pressures.


INTRODUCTION
 
Streptococcus pyogenes, also known as group A beta-hemolytic streptococcus (GAS), is among the most highly prevalent bacterial pathogens and has a worldwide distribution. Humans are the only known biological host. Although these organisms can cause severe invasive disease or give rise to an asymptomatic carrier state, most often they cause mild disease by infecting the upper respiratory tract or skin, resulting in pharyngitis or impetigo, respectively (7). The relative incidence of GAS disease varies throughout the world, in accordance with both season and locale. In the temperate regions of North America and Europe, pharyngitis is highly prevalent during the winter months and impetigo (although less common) is most often encountered during warmer weather. In many tropical regions, GAS impetigo is far more common than pharyngeal infection (4), and there may be no discrete seasonal peaks in incidence of disease (30).

Numerous typing schemes have been used to characterize and measure the genetic diversity among isolates of S. pyogenes. Perhaps the most common tool used today is emm typing (3, 13), which is based on sequence at the 5' end of a locus (emm) that is present in all isolates. The targeted region of emm displays the highest level of sequence polymorphism known for a widely distributed S. pyogenes gene; >150 emm types have been described to date (B. Beall, http://www.cdc.gov/ncidod/biotech/strep/emmtypes.htm). emm encodes the M protein, which forms the basis of a serological typing scheme (28). For many M proteins, the type-specific epitopes elicit strong host protective immunity (23).

There are four major subfamilies of emm genes, which are defined by sequence differences within the 3' end, encoding the peptidoglycan-spanning domain (22). The chromosomal arrangement of emm subfamily genes reveals five major emm patterns, denoted emm patterns A through E (6); strains with patterns B and C are rare and are currently grouped with emm pattern A strains (referred to as pattern A-C strains). A given isolate of S. pyogenes has one, two, or three emm genes lying in tandem on the chromosome, and each gene differs in sequence from the others. In strains having three emm genes, the determinants of emm type lie within the central emm locus.

The emm pattern A-C strains are usually recovered from cases of pharyngitis, whereas emm pattern D strains are most often isolated from impetigo lesions (4, 6, 10). As a group, emm pattern E strains are readily found at both primary tissue sites. For example, in tropical Australia, 84% of isolates recovered by population-based surveillance of an aboriginal community experiencing high rates of streptococcal impetigo and no cases of pharyngitis were either emm pattern D or E (4). In Rome, 98% of pharyngitis isolates were of emm types associated with emm pattern A-C or E (10). Thus, emm pattern can serve as a genotypic marker for tissue site preferences among S. pyogenes strains.

Multilocus sequence typing (MLST) is a relatively new tool for molecular typing of bacteria (8, 33). A principal advantage of MLST over gel-based methods is that the sequence data, which are generated for several neutral housekeeping loci, are unambiguous, electronically portable, and readily queried via the Internet (www.mlst.net). In this report, MLST and emm pattern determination are performed for many previously untested emm types of S. pyogenes. When these data are combined with data from previous reports (10, 12, 31), it is found that the large majority of known emm types (http://www.cdc.gov/ncidod/biotech/strep/emmtypes.htm) are represented. An analysis of the relationships among emm type, emm pattern, and the genetic relatedness defined by MLST is presented.


MATERIALS AND METHODS
 
Bacteria. The 107 new GAS isolates under study are listed in Table 1. Selection of bacterial isolates for this study was largely guided by knowledge of previously determined emm types, for the purpose of assembling a strain set having maximal diversity in emm type. Some new isolates were selected in order to ascertain emm pattern for more than one isolate of a given emm type. Strains designated SS followed by a string of three or four numerals are part of the Centers for Disease Control and Prevention (CDC) strain collection; strains designated with two, three, or four numerals followed by a hyphen and then two additional numerals, where the last two numerals in the series represent year of acquisition, are also part of the CDC strain collection; additional epidemiological information is posted at http://www.cdc.gov/ncidod/biotech/strep/emmtypes.htm for many strains. All other isolates from Australia were provided by K. Sriprakash and B. Currie. Strain CT95-201 was obtained from the State of Connecticut Department of Health (18). All other strains were obtained from the Lancefield collection (The Rockefeller University, New York, N.Y.).


TABLE 1. New GAS isolates associated with this report

Of the 495 isolates, the tissue site of isolation was unknown for 73 (15%), 183 (37%) were derived from impetigo lesions, 136 (27%) were recovered from normally sterile tissue sites, and 103 (21%) were from the upper respiratory tract. Of the upper respiratory tract isolates, 23 (22%) were known to be recovered from subjects with no disease symptoms, whereas 54 (52%) were definitively associated with disease. Isolates from impetigo lesions and normally sterile sites are, by definition, disease associated.

emm sequence typing. emm type, which closely corresponds to M serotype, was ascertained by nucleotide sequence determination as previously described (3, 12, 29); a unique emm type is defined as having <95% sequence identity to any other known type over the first 160 bp of sequence, allowing for small indels. A complete and current listing of GAS emm types is posted at ftp://ftp.cdc.gov/pub/infectious_diseases/biotech/emmsequ/ and http://www.cdc.gov/ncidod/biotech/strep/emmtypes.htm. emm pattern was determined by a PCR-based method, as previously described (4).

MLST. Internal fragments of seven housekeeping genes (gki, gtr, murI, mutS, recP, xpt, and yqiL) were amplified and sequenced with primers and under conditions described previously (12). For each locus, distinct allele numbers were assigned to each unique sequence, generating a seven-integer allelic profile for each isolate. Isolates with identical allelic profiles were assigned to the same sequence type (ST). A complete database of alleles, allele sequences, and STs is maintained on the Internet at www.mlst.net.
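
The assignment of allele numbers and STs described above can be sketched in a few lines. This is a minimal illustration, not the curated procedure behind www.mlst.net: the function name `assign_alleles_and_sts`, the input layout, and the numbering in order of first appearance are all assumptions made for the example, and real allele and ST numbers come from the database.

```python
# Seven MLST loci named in the text.
LOCI = ["gki", "gtr", "murI", "mutS", "recP", "xpt", "yqiL"]


def assign_alleles_and_sts(isolates):
    """Map each isolate to (allelic profile, ST).

    isolates: dict of isolate name -> dict of locus -> sequence string.
    Allele and ST numbers are assigned in order of first appearance
    (an assumption for illustration; real numbers come from the database).
    """
    allele_numbers = {locus: {} for locus in LOCI}  # sequence -> allele number
    st_numbers = {}                                 # profile -> ST number
    result = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in LOCI:
            table = allele_numbers[locus]
            seq = seqs[locus].upper()
            if seq not in table:
                table[seq] = len(table) + 1  # new unique sequence, new allele
            profile.append(table[seq])
        profile = tuple(profile)             # seven-integer allelic profile
        if profile not in st_numbers:
            st_numbers[profile] = len(st_numbers) + 1
        result[name] = (profile, st_numbers[profile])
    return result
```

Isolates with identical sequences at all seven loci receive the same profile and therefore the same ST; a single changed locus yields a new allele number at that locus and a new ST.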

Additional nucleotide sequence determination. Using bacterial DNA as a template, PCR amplification products were generated (annealing temperature, 50 or 55°C) with the following oligonucleotide primers: for the cpa locus, 5'-GGA TAT GAG ATT GCC GAA CCT ATT ACT TTT AAA G-3' (forward) and 5'-GGA GCC TGT TTA TCT TCC ATT CGA ATA ATA TCC AC-3' (reverse) (product size, ~600 bp); for the prtF1 locus, 5'-TGC GCG GGT TCT ATC GGT TTT GGT CAA GTA-3' (forward) and 5'-AAT TAG TTT T(T/C)T CA(G/A) (T/A)GC (T/C)TC ACG CAT TAA-3' (reverse) (product size, ~360 bp). The same primers were used for nucleotide sequence determination.

Computational analysis. Sequence (nucleotide and amino acid) alignments and percent sequence identity calculations were performed with Clustal W (DNAStar; version 5.05). The eBURST algorithm was applied with software available at http://eburst.mlst.net (15). Average distances between STs were calculated by the START-distance matrix method (www.mlst.net). For tests for independence, Fisher's two-tailed exact test was used (DnaSP; version 3.99).

Nucleotide sequence accession numbers. The new housekeeping allele sequences generated as part of this report were submitted to GenBank and assigned accession numbers AY520918 through AY521006. The new allele sequences associated with the cpa and prtF1 loci were submitted to GenBank and assigned accession numbers AY579608 through AY579635.


RESULTS
 
MLST of GAS. Allelic profiles at seven housekeeping loci were determined for 107 isolates of GAS (Table 1). The majority of isolates (75%) represent emm types not previously reported on for MLST. When these data were combined with previously reported data on 388 GAS isolates, 220 STs were recognized (10, 12, 31) (www.mlst.net). The total number of alleles at each housekeeping locus ranged from 36 (mutS) to 66 (gki). Collectively, 158 distinct emm types are included in the set of 495 isolates and 238 unique emm type-ST combinations can be defined. These 158 emm types represent the large majority of known emm types found in GAS, as defined by ftp://ftp.cdc.gov/pub/infectious_diseases/biotech/emmsequ/ and http://www.cdc.gov/ncidod/biotech/strep/emmtypes.htm.

Markers for tissue site preference. emm pattern serves as a useful genotypic marker for tissue site preferences of individual strains and clones. Of the 158 emm types represented within the complete set of 495 isolates, emm pattern was established for one or more isolates of 156 emm types (Table 2). Of the 76 emm types for which emm pattern was determined for two or more isolates, 74 (97%) of the emm types included isolates belonging to a single emm pattern group (i.e., A-C, D, or E). Only two emm types (54 and st854) were found in association with two emm pattern groups. Therefore, isolates of a given emm type usually have the same emm pattern grouping.


TABLE 2. emm types according to emm pattern marker for tissue site preference

The classical throat strains (emm pattern A-C) displayed the least diversity in emm type, accounting for only 17% of the 156 emm types that could be assigned an emm pattern (Table 2). emm types associated with patterns D and E were most abundant, representing 39 and 41%, respectively, of the total emm types. Two of the emm types (st1815 and st211) had a rearranged emm region. The data show that emm pattern D and E strains display the most diversity in emm type, whereas pattern A-C strains display the least.

The relationship between emm pattern subpopulations and genetic diversity, as defined by MLST, was also evaluated. Of the 220 STs resolved by MLST, emm pattern was determined for at least one representative of 202 STs. The classical throat strains (emm pattern A-C) displayed the least genetic diversity in their allelic profiles, accounting for only 18% of the 202 STs examined. STs associated with patterns D and E were most abundant, representing 36 and 47%, respectively, of the total number of STs. The data show that emm pattern E strains, as a group, display the most diversity in ST, whereas pattern A-C strains display the least. Pattern D strains are intermediate in their overall diversity of STs.

Relationships among STs. Of the 220 STs of GAS, the average distance from an ST to all other STs was 6.21 housekeeping alleles, calculated by the START-distance matrix method. The mean distance of an ST to the ST with the most similar allelic profile was 2.35 housekeeping alleles. Thus, many STs are distantly related to all others.
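
The two summary distances can be illustrated with a short sketch. It assumes that the distance between two STs is the number of the seven loci at which their allele numbers differ; the START software may compute its matrix differently, and the function names here are hypothetical.

```python
from itertools import combinations


def allelic_distance(p, q):
    """Number of loci at which two allelic profiles differ (0 to 7)."""
    return sum(a != b for a, b in zip(p, q))


def distance_summary(profiles):
    """Return (mean distance to all other STs, mean distance to nearest ST).

    profiles: dict of ST label -> tuple of seven allele numbers.
    Requires at least two STs.
    """
    sts = list(profiles)
    n = len(sts)
    total = {st: 0 for st in sts}       # summed distance to every other ST
    nearest = {st: None for st in sts}  # smallest distance to any other ST
    for a, b in combinations(sts, 2):
        d = allelic_distance(profiles[a], profiles[b])
        for st in (a, b):
            total[st] += d
            if nearest[st] is None or d < nearest[st]:
                nearest[st] = d
    mean_to_all = sum(total[st] / (n - 1) for st in sts) / n
    mean_nearest = sum(nearest.values()) / n
    return mean_to_all, mean_nearest
```

On this definition, a mean distance to all other STs that is close to the maximum of seven, combined with a much smaller nearest-neighbor distance, is the signature of a population made up of many small, mutually unrelated clusters.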

eBURST is an algorithm that can be used to subdivide MLST data into nonoverlapping groups of STs with a user-defined level of similarity in their allelic profiles (15). The most stringent definition of an eBURST group, where all STs assigned to the same group must share alleles at at least six of the seven MLST loci with at least one other ST in the group, identifies clusters of closely related genotypes that are considered to be descended from the same founder and that are defined as clonal complexes (15). To obtain a population snapshot, the group definition is set at zero of seven shared housekeeping alleles. Thirty-one clonal complexes were observed among the 220 STs with eBURST, and most of these were small clusters of two or three linked STs (Fig. 1). eBURST identifies the most likely founder of a clonal complex and provides bootstrap support for the assignment. For the 220 GAS STs, a founder ST was assigned in only 11 of the 31 clonal complexes; 65% of the clonal complexes were doublets where the direction of evolution is unknown. However, the bootstrap support was <70% for all founder STs, except for ST65 (99% confidence). For each of the 31 clonal complexes, all STs had emm types belonging to the same emm pattern group (Table 2).
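
The most stringent group definition amounts to finding connected clusters of STs in which each ST shares at least six of seven alleles with some other member. The sketch below covers only that grouping step, under that reading; it does not reproduce eBURST's founder prediction or bootstrap analysis, and the function name and ST labels are hypothetical.

```python
def clonal_complexes(profiles, min_shared=6):
    """Group STs linked by sharing >= min_shared alleles with another member.

    profiles: dict of ST label -> tuple of seven allele numbers.
    Returns (list of complexes as sets, list of unlinked singleton STs).
    """
    parent = {st: st for st in profiles}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    sts = list(profiles)
    for i, a in enumerate(sts):
        for b in sts[i + 1:]:
            shared = sum(x == y for x, y in zip(profiles[a], profiles[b]))
            if shared >= min_shared:
                parent[find(a)] = find(b)  # link the two clusters

    groups = {}
    for st in sts:
        groups.setdefault(find(st), set()).add(st)
    complexes = [g for g in groups.values() if len(g) > 1]
    singletons = [next(iter(g)) for g in groups.values() if len(g) == 1]
    return complexes, singletons
```

Because linkage is transitive, two STs in the same complex may share fewer than six alleles with each other, provided a chain of single-locus variants connects them.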



FIG. 1. Population snapshot by eBURST. The entire S. pyogenes database of 495 isolates is displayed as a single eBURST diagram, by setting the group definition to zero of seven shared alleles, which places all isolates in a single group. Each dot represents an ST, and the size of the dot reflects the number of GAS isolates in each ST for the set of 495 isolates under study. STs that differ by a single locus are linked with a solid line; clusters of linked isolates correspond to clonal complexes. Founder STs are labeled (arrows), although, except for ST65, the bootstrap support for the founders was low. The distribution and spacing of unlinked STs and clonal complexes in a population snapshot are not relational and provide no information about the genetic distances between them.

The relative contributions of point mutation and recombination to the initial stages of clonal diversification can be assessed from MLST data, by identifying those STs that are very closely related, differing at only one of the seven MLST loci (single-locus variants [SLVs]). The sequences of the alleles at the single altered locus are then analyzed to distinguish whether the change in the housekeeping gene has occurred by recombination or by mutation (14, 16). The criteria for assigning an allelic change as resulting from mutation or recombination used by Feil et al. (16) are based on the identification of the founder ST and its associated SLVs within a clonal complex. Within the GAS data set there are 48 pairs of STs that differ at a single locus, but founders cannot be confidently predicted in the majority of cases. In this study, the assignment of the allelic change in SLV pairs as the result of mutation or recombination was therefore based on the following assumptions. If there are multiple (more than one) nucleotide differences among the alleles at the locus that differs among SLVs, a recombinational event is assumed to have occurred, because the probability of multiple independent point mutations at one locus with none at any of the other six loci is low. If there is only a single nucleotide difference, assignment of the variant as the result of mutation or recombination is more complicated (16). A random point mutation is expected to produce a novel allele restricted to the SLV in which it arises; however, alleles introduced by recombination should be present in other strains in the population and most may be present in a large MLST database. In this study, where both alleles at the variant locus of the SLV pair were found in distantly related STs, we assume that the allelic change was due to recombination. 
The remaining alleles that differ at a single site will probably still include some that arose by recombination from a donor allele that is absent from the data set, and thus this procedure provides a minimum estimate of the extent of recombination compared to point mutation.
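
The decision rule set out above can be written as a small function. This is a sketch under stated assumptions: the two alleles are taken to be aligned and of equal length, and whether an allele occurs in a "distantly related" ST is passed in as a precomputed flag, since the text does not give a numeric threshold for that judgment. The function and parameter names are hypothetical.

```python
def classify_slv_change(allele_a, allele_b, a_in_distant_st, b_in_distant_st):
    """Classify the allelic change separating a single-locus variant pair.

    allele_a, allele_b: aligned sequences of the two alleles at the variant locus.
    a_in_distant_st, b_in_distant_st: True if that allele is also found in a
    distantly related ST (assumed to be determined beforehand).
    """
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be aligned to equal length")
    ndiff = sum(x != y for x, y in zip(allele_a, allele_b))
    if ndiff == 0:
        raise ValueError("alleles are identical; not a variant locus")
    if ndiff > 1:
        # Multiple differences at one locus with none at the other six loci.
        return "recombination"
    if a_in_distant_st and b_in_distant_st:
        # Single difference, but both alleles already circulate elsewhere.
        return "recombination"
    # Single difference with a novel allele: counted as point mutation,
    # although a donor allele absent from the data set cannot be excluded.
    return "putative point mutation"
```

The last branch is why the procedure gives a minimum estimate of recombination: every ambiguous single-site change defaults to point mutation.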

Among the 48 SLV pairs identified by eBURST among the 220 STs (Fig. 1), 20 allelic changes were designated recombination events, based on multiple nucleotide differences among alleles at the variant locus (data not shown). The remaining 28 SLV pairs had a single nucleotide difference among the alleles. Of these, in eight cases both alleles of the SLV pair were present in one or more distantly related STs. Thus, 28 of the allelic changes were considered to be due to recombination and ≤20 were considered to be due to point mutation, and housekeeping loci in GAS are estimated to change by recombination at least 1.4 times more frequently than by point mutation.
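
The tally behind these figures can be written out explicitly. Here R and M are labels introduced only for this illustration, denoting the numbers of allelic changes attributed to recombination and to point mutation among the 48 SLV pairs.

```latex
\[
R \ge 20 + 8 = 28, \qquad
M \le 48 - 28 = 20, \qquad
\frac{R}{M} \ge \frac{28}{20} = 1.4, \qquad
\frac{R}{R + M} \ge \frac{28}{48} \approx 0.58
\]
```

The inequalities reflect the fact that some of the 20 single-nucleotide changes counted as point mutations may in fact be recombinational imports from donor alleles absent from the data set.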

Of the 28 allelic changes classified as recombination events, all involved emm pattern D or E strains (16 and 12 genetic events, respectively). Further studies are required to obtain a more precise estimate of the ratio of recombination to mutation and to firmly establish whether recombination is a more common mode of evolutionary change at housekeeping loci in emm pattern D and E strains than in pattern A-C strains.

Association of multiple emm types with a single ST. The great majority of STs were found in association with a single emm type (208 of 220; 95%). Only 12 STs included isolates of two or more different emm types (Table 3); these are referred to as emm-variable STs. However, the 12 emm-variable STs involved a disproportionately large fraction of the total number of emm types (30 of 158, 19%). Three emm-variable STs were associated with emm pattern A-C strains, eight were associated with pattern D strains, and only one was associated with pattern E strains. None of the STs were associated with emm types corresponding to different emm pattern groups.


TABLE 3. STs associated with more than one emm type

Eight (28%) of the 29 emm types associated with emm pattern A-C strains (Table 2) were found among emm-variable STs (ST65, -83, and -84; Table 3). Similarly, for emm pattern D strains, 20 of the 63 emm types (32%) were associated with emm-variable STs (ST3, -4, -9, -10, -11, -123, -174, and -182). In sharp contrast, only 2 of the 64 pattern E emm types (3%) were associated with an emm-variable ST (ST39). Thus, unlike the emm types characteristic of emm pattern A-C and D strains, the STs of emm pattern E isolates rarely include isolates with more than one emm type (P < 0.002; Fisher's exact test, two-tailed).
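
A two-tailed Fisher's exact test on counts like these can be sketched in pure Python. The 2 x 2 tabulation used in the usage note (pattern E versus pattern A-C and D pooled) is an assumption made for illustration, since the text does not state exactly which table was tested, and the function name is hypothetical. The two-tailed P value is computed by the common convention of summing all tables no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        p = prob(x)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            total += p
    return total
```

With the counts given above, one plausible table is 2 emm-variable and 62 other emm types for pattern E against 28 emm-variable and 64 other emm types for patterns A-C and D combined, that is, `fisher_exact_two_tailed(2, 62, 28, 64)`.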

The extent of similarity between the emm sequences of those isolates that have the same ST but different emm types was examined, as this may distinguish variation in emm type that has arisen by the accumulation of point mutations from that arising by horizontal gene transfer. For many of these emm-variable STs, the different emm types have <50% nucleotide sequence identity, and, for all emm types associated with the same ST, the emm type sequences were ≤91% identical in nucleotide sequence and ≤84% identical in the corresponding amino acid sequence of the M protein (Table 3). However, close examination of sequence alignments suggests that emm type st1RP31 arose from emm type 30 via intragenic recombination resulting in small deletions; both strains are ST65. Furthermore, emm type sts104 appears to have arisen via fusion of the leader-coding region of emm4 with a downstream emm gene (enn4), on an ST39 genetic background, although the emm type sts104 strain was successfully mapped as emm pattern E. Aside from the two exceptions noted, the large number of sequence differences between emm types strongly suggests that horizontal transfer of emm followed by intergenomic recombination is the primary mechanism underlying emm-variable STs, rather than intragenomic recombination or divergence by point mutation.

Analysis of other adaptive loci in emm-variable STs. If recombinational replacement of emm type is a recent event, then other loci distant from emm on the genome should display little or no sequence variation among isolates that have the same ST but which differ in emm type. The FCT (fibronectin-collagen-T antigen) region of the GAS genome encodes surface proteins that bind host extracellular matrix proteins (fibronectin and collagen). The FCT region displays high overall levels of genetic diversity and lies ~300 kb from the emm region (5, 17, 27, 32). Two FCT region genes, prtF1 and cpa, were examined for sequence diversity in GAS isolates sharing the same ST but differing in emm type (Table 4).


TABLE 4. Genetic diversity at other adaptive loci for isolates of differing emm types sharing an ST

For 10 isolates possessing the prtF1 locus (emm patterns A-C and E) and representing four STs and 10 emm types, four major sequence clusters of partial prtF1 genes, corresponding to the 5' end region, were identified (Table 4). The percent nucleotide sequence identity among alleles belonging to different prtF1 sequence clusters ranged from 62 to 69%, whereas the amino acid sequence identity among different clusters ranged from 46 to 58%, suggestive of a history of strong diversifying selection at the prtF1 locus. However, in each case, all isolates with the same ST but different emm types also had identical prtF1 alleles. The lack of variation among prtF1 alleles within emm-variable STs adds further support to the idea that recombinational replacements that lead to variation in the emm type of isolates of a single ST are recent evolutionary events in emm pattern A-C strains.

Many emm pattern D strains harbor a cpa gene, rather than the prtF1 gene, within their FCT regions. Of 20 pattern D isolates, belonging to eight STs and representing 20 different emm types, the nucleotide sequence was determined for an internal portion (5' end region) of the cpa gene for 18 strains (Table 4). The partial cpa genes formed three discrete sequence clusters, with two alleles in each major cluster. The percent nucleotide sequence identity among cpa alleles belonging to the same sequence cluster was high (>99%). However, the percent nucleotide sequence identity among alleles belonging to different cpa sequence clusters was much lower, ranging from 62 to 68%; the amino acid sequence identity among different clusters ranged from 49 to 60%. The sequence data suggest that, like prtF1, the cpa locus has a history of being subject to strong diversifying selection.

In contrast to what was found for prtF1, strains having distinct emm types but the same ST were not necessarily uniform in their cpa genes (Table 4). Although four of the seven emm-variable STs examined had identical cpa alleles in strains with different emm types (ST3, -123, -174, and -182), two emm-variable STs had cpa genes belonging to distant sequence clusters (ST9 and -11); in a third (ST4), the cpa fragment could not be amplified from one of the two strains with the cpa primers. These findings suggest that, for the emm pattern D subpopulation, the emergence of strains of the same ST, but with different emm types, may in some cases be more complex than a one-step recombinational replacement of emm.


DISCUSSION
 
The findings presented in this report are part of a long-term effort to gain a comprehensive understanding of the genetic diversity present within this medically important bacterial species. emm types are commonly used to characterize GAS, and at least one representative strain of the large majority of currently known emm types was examined for both ST (based on housekeeping genes) and a genetic marker for tissue site preference (based on emm pattern). The lower level of genetic diversity, for both emm type and ST, that was observed among the throat strain group (emm pattern A-C) may reflect its lower prevalence among the world's human host population: The majority of human hosts inhabit tropical and semitropical regions of the developing world, in which the incidence of streptococcal skin infection is generally high and the incidence of streptococcal pharyngitis is often moderate to low.

At least 58% (28 of 48) of the recent changes at housekeeping loci in GAS appear to be due to recombination, and this value may be substantially greater, since many alleles among SLVs that differ at a single nucleotide site may have arisen by recombination involving a very similar donor allele rather than by point mutation. The best estimate at present is that recombination changes alleles of housekeeping loci at least 1.4 times more commonly than point mutation. The major contribution of recombination to allelic change is consistent with previous findings that demonstrated a complete absence of congruence among the gene tree topologies for the seven MLST loci, for GAS genotypes representing all emm pattern groups (14). The lack of congruency between loci suggests that, in the long term, recombination has eliminated all phylogenetic signal from gene trees. This finding is further supported by a lack of strong bootstrap support in a phylogenetic tree based on concatenated housekeeping alleles (25).

The emm pattern A-C subpopulation (throat specialists) of S. pyogenes may differ from the skin specialists (emm pattern D) and generalists (emm pattern E) in the relative impact of recombination compared to point mutation in genetic diversification at housekeeping loci. Those recombinational changes at MLST loci that can clearly be discerned appear to have been much more common in emm pattern group D and E strains than in pattern A-C strains. This trend was also observed in an analysis of congruence among housekeeping gene tree topologies, where 5, 0, and 1 of the 42 possible pairwise tree comparisons were significantly congruent for the emm pattern A-C, D, and E subpopulations, respectively (25). Although allelic changes by recombination were less readily detected among the emm pattern A-C strains using eBURST, it is important to emphasize that recombination was observed in all of the emm pattern-defined subpopulations according to several analytic methods (25).

The total number of STs within each clonal complex identified by eBURST was rather low and probably reflects our sampling strategy. In general, eBURST may identify few clonal complexes, and few large clonal complexes, in populations where sampling has largely been designed to uncover the genetic diversity within the species (11, 16, 34), as in this work, where a small number of isolates or a single isolate of most emm types was examined. Thus, a more optimal sampling of GAS will be required for identifying many additional clonal complexes, for defining their founding genotypes, and for exploring the patterns of descent, in order to provide a better assessment of the impact of recombination and mutation.

The data suggest that a significant proportion of emm pattern A-C and D strains, but not pattern E strains, have a recent history of recombinational replacement of emm type, yielding STs that are associated with multiple, divergent emm types. These events may be relatively recent, as no variation in a gene that is believed to be under diversifying selection (prtF1) was detected in isolates of the emm-variable STs of emm pattern A-C. Pattern D strains generally lack prtF1 but instead harbor cpa, which is located at the same approximate position within the genome (5, 32). Not all pattern D strains sharing the same ST and harboring divergent emm types had the same cpa allele; distant sequence clusters of cpa genes were observed on the same ST background in association with different emm types. Thus, in some cases, diversification at the rapidly evolving cpa locus may have occurred subsequent to the recombinational replacement of the emm gene. Analysis of additional loci may aid in obtaining a more complete understanding of the recent evolutionary history of these strains.

Recombinational replacement of emm type, which may occur during coinfection of a single host tissue site by multiple GAS strains, can potentially provide an avenue for immune escape. The ability of a strain to successfully be transmitted to a new human host diminishes as protective immunity arising from infection gradually builds among the host population (1, 19, 20). For many GAS strains, the type-specific epitopes of the M protein elicit strong protective immunity (2, 9, 23, 24, 28). If the emm type of a parent (recipient) strain is replaced with a new emm type from an unrelated donor strain, the new genotype may have a strong selective advantage if the host population is largely nonimmune to the emm type of the donor strain and immune to the emm type of the parent strain. The ability to recover multiple emm types in association with a single ST through epidemiologic sampling, as shown in this report, may reflect past strain-to-strain competition mediated through herd immunity. Patients with impetigo often differ from those with pharyngitis in their immune response to specific S. pyogenes antigens (26). This may be the result of fundamental differences in the host immune response to infection at these two tissue sites, which in turn, may provide a basis for differential selection pressures on the subpopulations of strains. Examination of the relationships between alleles at neutral (housekeeping) and adaptive (e.g., emm, prtF1, and cpa) loci of GAS may allow one to make reasonable predictions on the strength of host immune selection acting on each adaptive locus.

Of the 48 SLV pairs identified by eBURST, 17 pairs were represented by an ST that was also a recipient for recombinational replacement of emm type. In fact, all but one of the 12 emm-variable STs (the exception being ST182) were represented among the clonal complexes identified by eBURST. Among the 17 SLV pairs represented by an emm-variable ST, nine (53%) of the genetic diversification events at housekeeping loci were attributed to recombination; however, in most cases, it remained unclear whether the emm-variable ST was the likely ancestral ST. Frequent acquisition of genes via horizontal transfer could be due to high prevalence of the recipient strain within the human host population, with increased opportunities to be present within mixed infections, or, alternatively, could be due to intrinsic properties that render certain STs highly efficient as recipients of recombinational and/or lateral gene transfer events. It is perhaps of relevance here that some strains of S. pyogenes appear to be naturally transformable and that, furthermore, the locus (sil) that confers the competence phenotype has a limited distribution among strains (21). Generalized transduction may also be an important mechanism for horizontal transfer leading to homologous recombination in S. pyogenes.
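For readers unfamiliar with the term, an SLV (single-locus variant) pair is a pair of STs whose allelic profiles differ at exactly one of the seven housekeeping loci. The following is a minimal sketch of that definition only; it is not the eBURST algorithm, which additionally groups STs into clonal complexes and predicts founders. The ST labels and allele numbers are hypothetical and chosen purely for illustration.

```python
from itertools import combinations

# Hypothetical allelic profiles: ST label -> allele numbers at seven loci.
profiles = {
    "ST_a": (1, 2, 3, 4, 5, 6, 7),
    "ST_b": (1, 2, 3, 4, 5, 6, 8),        # differs from ST_a at one locus
    "ST_c": (1, 2, 9, 4, 5, 6, 8),        # differs from ST_b at one locus
    "ST_d": (10, 11, 12, 13, 14, 15, 16), # unrelated at all seven loci
}

def loci_differing(p, q):
    """Count the loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q))

def slv_pairs(st_profiles):
    """Return all ST pairs that differ at exactly one locus, in sorted order."""
    return sorted(
        (s, t)
        for s, t in combinations(sorted(st_profiles), 2)
        if loci_differing(st_profiles[s], st_profiles[t]) == 1
    )

pairs = slv_pairs(profiles)
```

In this toy input, `ST_a`/`ST_b` and `ST_b`/`ST_c` qualify as SLV pairs, whereas `ST_a`/`ST_c` differ at two loci and do not. Deciding whether the single differing allele arose by point mutation or recombination requires comparing the allele sequences themselves, which this sketch does not attempt.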

A comprehensive catalogue of STs and emm patterns for the majority of known emm types of GAS, as presented in this report, provides a foundation for addressing questions on the population substructure of this biologically diverse bacterial pathogen. emm pattern D and E strains account for >80% of emm types, and therefore, from a global standpoint, these strains are of medical importance. The STs of emm pattern E isolates are rarely associated with more than one emm type. Divergent emm types associated with the same ST were a far more common feature of pattern A-C and D emm types; however, the genetic mechanisms underlying the emergence of the population structures of the emm pattern A-C and emm pattern D subpopulations seem to be distinct. Genetic diversification by recombination appeared to be the dominant mechanism in emm pattern D and E strains but was less readily detectable among pattern A-C strains. When genetic diversification is combined with differential effects of host immune selection on each of the emm pattern-defined subpopulations, distinct population substructures can emerge.


ACKNOWLEDGMENTS
 
We thank the many investigators worldwide who provided GAS strains to the CDC and the investigators who provided strains to D.E.B.

This work was supported by the National Institutes of Health (GM60793, to D.E.B. and B.G.S.; AI053826, to D.E.B.), the American Heart Association (grant-in-aid, to D.E.B.), and the Wellcome Trust (to B.G.S.). B.G.S. is a Wellcome Trust Principal Research Fellow.


FOOTNOTES
 
* Corresponding author. Mailing address: New York Medical College, Department of Microbiology & Immunology, Valhalla, NY 10595. Phone: (914) 594-4193. Fax: (914) 594-4176. E-mail: debra_bessen@nymc.edu.


REFERENCES
 
  1. Anderson, R. M. 1998. Analytic theory of epidemics, p. 23-50. In R. M. Krause (ed.), Emerging infections. Academic Press, New York, N.Y.
  2. Beachey, E. H., J. M. Seyer, J. B. Dale, W. A. Simpson, and A. H. Kang. 1981. Type-specific protective immunity evoked by synthetic peptide of Streptococcus pyogenes M protein. Nature 292:457-459.
  3. Beall, B., R. Facklam, and T. Thompson. 1996. Sequencing emm-specific PCR products for routine and accurate typing of group A streptococci. J. Clin. Microbiol. 34:953-958.
  4. Bessen, D. E., J. R. Carapetis, B. Beall, R. Katz, M. Hibble, B. J. Currie, T. Collingridge, M. W. Izzo, D. A. Scaramuzzino, and K. S. Sriprakash. 2000. Contrasting molecular epidemiology of group A streptococci causing tropical and non-tropical infections of the skin and throat. J. Infect. Dis. 182:1109-1116.
  5. Bessen, D. E., and A. Kalia. 2002. Genomic localization of a T-serotype locus to a recombinatorial zone encoding extracellular matrix-binding proteins in Streptococcus pyogenes. Infect. Immun. 70:1159-1167.
  6. Bessen, D. E., C. M. Sotir, T. L. Readdy, and S. K. Hollingshead. 1996. Genetic correlates of throat and skin isolates of group A streptococci. J. Infect. Dis. 173:896-900.
  7. Bisno, A. L., and D. Stevens. 2000. Streptococcus pyogenes (including streptococcal toxic shock syndrome and necrotizing fasciitis), p. 2101-2117. In G. L. Mandell, R. G. Douglas, and R. Dolin (ed.), Principles and practice of infectious diseases, 5th ed., vol. 2. Churchill Livingstone, Philadelphia, Pa.
  8. Chan, M. S., M. C. Maiden, and B. G. Spratt. 2001. Database-driven multi locus sequence typing (MLST) of bacterial pathogens. Bioinformatics 17:1077-1083.
  9. Dale, J. B., and E. H. Beachey. 1986. Localization of protective epitopes of the amino terminus of type 5 streptococcal M protein. J. Exp. Med. 163:1191-1202.
  10. Dicuonzo, G., G. Gherardi, G. Lorino, S. Angeletti, M. DeCesaris, E. Fiscarelli, D. E. Bessen, and B. Beall. 2001. Group A streptococcal genotypes from pediatric throat isolates in Rome, Italy. J. Clin. Microbiol. 39:1687-1690.
  11. Enright, M. C., D. A. Robinson, G. Randle, E. J. Feil, H. Grundmann, and B. G. Spratt. 2002. The evolutionary history of methicillin-resistant Staphylococcus aureus (MRSA). Proc. Natl. Acad. Sci. USA 99:7687-7692.
  12. Enright, M. C., B. G. Spratt, A. Kalia, J. H. Cross, and D. E. Bessen. 2001. Multilocus sequence typing of Streptococcus pyogenes and the relationship between emm type and clone. Infect. Immun. 69:2416-2427.
  13. Facklam, R. F., D. R. Martin, M. Lovgren, D. R. Johnson, A. Efstratiou, T. A. Thompson, S. Gowan, P. Kriz, G. J. Tyrrell, E. Kaplan, and B. Beall. 2002. Extension of the Lancefield classification for group A streptococci by addition of 22 new M protein gene sequence types from clinical isolates: emm103 to emm124. Clin. Infect. Dis. 34:28-38.
  14. Feil, E. J., E. C. Holmes, D. E. Bessen, M.-S. Chan, N. P. J. Day, M. C. Enright, R. Goldstein, D. Hood, A. Kalia, C. E. Moore, J. Zhou, and B. G. Spratt. 2001. Recombination within natural populations of pathogenic bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proc. Natl. Acad. Sci. USA 98:182-187.
  15. Feil, E. J., B. C. Li, D. M. Aanensen, W. P. Hanage, and B. G. Spratt. 2004. eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J. Bacteriol. 186:1518-1530.
  16. Feil, E. J., J. M. Smith, M. C. Enright, and B. G. Spratt. 2000. Estimating recombinational parameters in Streptococcus pneumoniae from multilocus sequence typing data. Genetics 154:1439-1450.
  17. Ferretti, J. J., W. M. McShan, D. Ajdic, D. J. Savic, G. Savic, K. Lyon, C. Primeaux, S. Sezate, A. N. Suvorov, S. Kenton, H. S. Lai, S. P. Lin, Y. Qian, H. G. Jia, F. Z. Najar, Q. Ren, H. Zhu, L. Song, J. White, X. Yuan, S. W. Clifton, B. A. Roe, and R. McLaughlin. 2001. Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc. Natl. Acad. Sci. USA 98:4658-4663.
  18. Fiorentino, T. R., B. Beall, P. Mshar, and D. E. Bessen. 1997. A genetic-based evaluation of principal tissue reservoir for group A streptococci isolated from normally sterile sites. J. Infect. Dis. 176:177-182.
  19. Gupta, S., and R. Anderson. 1999. Population structure of pathogens: the role of immune selection. Parasitol. Today 15:497-501.
  20. Gupta, S., M. C. J. Maiden, I. M. Feavers, S. Nee, R. M. May, and R. M. Anderson. 1996. The maintenance of strain structure in populations of recombining infectious agents. Nat. Med. 2:437-442.
  21. Hidalgo-Grass, C., M. Ravins, M. Dan-Goor, J. Jaffe, A. E. Moses, and E. Hanski. 2002. A locus of group A Streptococcus involved in invasive disease and DNA transfer. Mol. Microbiol. 46:87-99.
  22. Hollingshead, S. K., T. L. Readdy, D. L. Yung, and D. E. Bessen. 1993. Structural heterogeneity of the emm gene cluster in group A streptococci. Mol. Microbiol. 8:707-717.
  23. Hu, M. C., M. A. Walls, S. D. Stroop, M. A. Reddish, B. Beall, and J. B. Dale. 2002. Immunogenicity of a 26-valent group A streptococcal vaccine. Infect. Immun. 70:2171-2177.
  24. Jones, K. F., B. N. Manjula, K. H. Johnston, S. K. Hollingshead, J. R. Scott, and V. A. Fischetti. 1985. Location of variable and conserved epitopes among the multiple serotypes of streptococcal M protein. J. Exp. Med. 161:623-628.
  25. Kalia, A., B. G. Spratt, M. C. Enright, and D. E. Bessen. 2002. Influence of recombination and niche separation on the population genetic structure of the pathogen Streptococcus pyogenes. Infect. Immun. 70:1971-1983.
  26. Kaplan, E., B. Anthony, S. Chapman, E. Ayoub, and L. Wannamaker. 1970. The influence of the site of infection on the immune response to group A streptococci. J. Clin. Investig. 49:1405-1414.
  27. Kreikemeyer, B., K. S. McIver, and A. Podbielski. 2003. Virulence factor regulation and regulatory networks in Streptococcus pyogenes and their impact on pathogen-host interactions. Trends Microbiol. 11:224-232.
  28. Lancefield, R. C. 1962. Current knowledge of the type specific M antigens of group A streptococci. J. Immunol. 89:307-313.
  29. Li, Z. Y., V. Sakota, D. Jackson, A. R. Franklin, B. Beall, and the Active Bacterial Core Surveillance/Emerging Infections Program Network. 2003. Array of M protein gene subtypes in 1064 recent invasive group A streptococcus isolates recovered from the active bacterial core surveillance. J. Infect. Dis. 188:1587-1592.
  30. Martin, D. R., and K. S. Sriprakash. 1996. Epidemiology of group A streptococcal disease in Australia and New Zealand. Rec. Adv. Microbiol. 4:1-40.
  31. McGregor, K., N. Bilek, A. Bennett, A. Kalia, B. Beall, J. Carapetis, B. Currie, K. Sriprakash, B. Spratt, and D. Bessen. 2004. Group A streptococci from a remote community have novel multilocus genotypes but share emm-types and housekeeping alleles. J. Infect. Dis. 189:717-723.
  32. Podbielski, A., M. Woischnik, B. A. B. Leonard, and K.-H. Schmidt. 1999. Characterization of nra, a global negative regulator gene in group A streptococci. Mol. Microbiol. 31:1051-1064.
  33. Spratt, B. G. 1999. Multilocus sequence typing: molecular typing of bacterial pathogens in an era of rapid DNA sequencing and the Internet. Curr. Opin. Microbiol. 2:312-316.
  34. Spratt, B. G., W. P. Hanage, and E. J. Feil. 2001. The relative contributions of recombination and point mutation to the diversification of bacterial clones. Curr. Opin. Microbiol. 4:602-606.






