Journal of Bacteriology, May 2005, p. 3311-3318, Vol. 187, No. 10
0021-9193/05/$08.00+0 doi:10.1128/JB.187.10.3311-3318.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
William W. Taylor,3
Donald E. Low,4
Allison McGeer,4 and
Malak Kotb1,2,5*
Departments of Molecular Sciences,1 Surgery,2 Molecular Resource Center, University of Tennessee Health Science Center,3 Research Center, Veterans Affairs Medical Center, Memphis, Tennessee,5 Mount Sinai Hospital and University of Toronto, Toronto, Ontario, Canada4
Received 12 December 2004/ Accepted 1 February 2005
Although the remarkable prevalence, persistence, and virulence of M1T1 isolates were previously attributed to a number of individual genetic factors (34, 36, 42), it is likely that the unique features of this clone result from complex traits encoded by a number of interacting genetic features and controlled by complex regulatory networks. With the advent of sophisticated genomic tools, we aimed to identify, at the genomic level, the unique bacterial genetic factors that distinguish this M1T1 clone from other M1 isolates and that might provide clues as to its prevalence, persistence, and virulence.
Unlike the M1T1 clonal strain, another closely related fully sequenced M1 strain, SF370, isolated from a wound infection (21, 43), has not been frequently isolated from severe invasive GAS infections (27) and has not shown the same pattern of prevalence and persistence seen in the M1T1 clonal strain. We took advantage of the high similarity, yet striking difference in epidemiology, between these two M1 strains and used differential microarray hybridization to identify unique genetic features of the clonal M1T1 strain without the need to sequence its entire genome. As expected, the majority of the differences were attributed to prophage sequences. The importance of prophage content in the diversification of various subclones of GAS M3 serotype has been recently demonstrated by Beres et al. (6). Here, we identified three distinct prophages integrated into the M1T1 genome, two of which are not found in the M1 SF370 strain; the third has two variants that distinguish two M1T1 lineages and has likely emerged due to phage exchange between two distinct M serotypes. We also discovered that the genomes of these prophages are highly mosaic, with different regions being related to distinct GAS phages. Furthermore, we identified a highly conserved open reading frame (ORF) adjacent to the toxins (paratox; prx) in the majority of GAS prophages and found that allelic variants of paratox are in linkage disequilibrium with specific toxin genes. Based on these observations, we propose a model of recombination-induced toxin exchange among the GAS prophages.
Bacterial strains and culture conditions. Extensively characterized clonal M1T1 clinical isolates from invasive GAS infection cases were used in this study (12, 17, 32). One representative isolate, M1T1-6050, from the clonal M1T1 strain (12) was used in generating the genomic library and was compared, in the microarray experiments, to strain SF370 (ATCC 700294), isolated from an infected wound (21, 43). However, in certain studies, the microarray results (obtained from M1T1-6050 DNA) were confirmed by use of DNA from several isolates belonging to the same M1T1 clone (12). For simplification, M1T1-6050 is referred to as M1T1 throughout this article.
All GAS isolates were grown in Todd-Hewitt broth (Difco Laboratories) supplemented with 1.5% yeast extract (THY). Escherichia coli, in which the M1T1-6050 genomic library was generated, was grown in Luria-Bertani (LB) broth supplied with 50 µg/ml carbenicillin (Sigma).
Generation of the M1T1 GAS library and construction of microarrays. In collaboration with Lucigen Corporation (Middleton, WI), we generated a genomic library for the M1T1-6050 isolate. M1T1 chromosomal DNA was extracted by a modified phenol-chloroform method (11), randomly sheared to 1 to 3 kb, and then cloned in pSMART-LC (Lucigen). DH10B electrocompetent cells were transformed with the ligated vectors at Lucigen Corporation to produce the M1T1 library. Colonies (n = 6,144) were picked manually and subcultured in 96-well plates. With an average GAS genome length of 1.9 × 10^6 bp and an average insert size of 2,000 bp, 6,144 clones provide 99.84% genome coverage, as calculated from the Poisson distribution (22).
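The coverage figure above can be checked with a short calculation. The following is a minimal sketch, assuming the common Poisson approximation in which the probability that a given base is present in at least one clone is 1 - exp(-N × i / G); the exact formula used in reference 22 is not restated in the text, and the function and variable names here are illustrative only.

```python
import math

def library_coverage(genome_bp: float, insert_bp: float, n_clones: int) -> float:
    """Probability that a given base is present in at least one clone.

    Assumes clones are sampled uniformly at random, so the number of clones
    covering a base is approximately Poisson with mean N * i / G.
    """
    mean_depth = n_clones * insert_bp / genome_bp
    return 1.0 - math.exp(-mean_depth)

# Values stated in the text: 1.9 x 10^6 bp genome, 2,000-bp inserts, 6,144 clones
coverage = library_coverage(1.9e6, 2000, 6144)
print(f"Estimated coverage: {coverage:.4f}")
```

Under this approximation the mean depth is about 6.5-fold, which is consistent with the coverage of roughly 99.8% reported above.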
The glass microarrays were manufactured in the Molecular Resource Center and the Vision Core Facility at the University of Tennessee. We used a MicroGridII microarrayer (BioRobotics, Genomic Solutions) to spot the probes (library PCR products) onto superamine glass slides (Telechem International Inc.).
Labeling, hybridization, and image analysis. Sheared chromosomal DNA from both M1 SF370 and M1T1-6050 strains was labeled by random priming with either Cy3 or Cy5 fluorescent nucleotides, as detailed in the supplemental Materials and Methods. Equal amounts of the labeled genomic DNA from both M1 SF370 and M1T1 strains were mixed and used to hybridize the unlabeled DNA probes on the microarray slides. The slides were dried and then scanned with a GenePix4000B scanner (Axon Instruments, Inc.). All steps were performed in the dark. The experiment was repeated twice, with three replicate microarrays each time. The scanned images were analyzed with the GenePixPro 4.0 software (Axon Instruments).
Probes that hybridized preferentially to labeled M1T1 DNA were chosen, and the corresponding clones were recovered from the master plates, amplified by the TempliPhi system (Amersham), and sequenced on an ABI PRISM 3100 Genetic Analyzer (Applied Biosystems).
Sequence assembly, annotation, and bioinformatic analysis. The sequence of each probe was compared to the nonredundant GenBank database by use of BLASTN and BLASTX software (1). BLASTN and BLASTX results were parsed in independent files by PERL bioinformatics scripts (R.A.E., unpublished scripts). We assembled the probe sequences into larger fragments and then into prophages using the Phred, Phrap, and Consed sequence analysis software package (23) and the Vector NTI (VNTI) Suite (Informax Inc.). We closed the gaps and corrected low-quality sequences in the assembled fragments by additional PCR amplifications followed by primer extension sequencing. Finally, we used VNTI to assemble all fragments and additional sequences into prophages and to identify and annotate all ORFs.
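The parsing step can be illustrated with a short sketch. The authors' PERL scripts are unpublished, so the following Python code is not their method; it only shows one plausible way to keep the best hit per probe, assuming BLAST results saved in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E value, bit score). The probe names and scores in the example are made up for illustration.

```python
from collections import Counter

def best_hits(tabular_lines):
    """Return {query_id: (subject_id, percent_identity, bit_score)}.

    For each query, only the hit with the highest bit score is kept.
    Blank lines and comment lines starting with '#' are skipped.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, bits = float(fields[2]), float(fields[11])
        if query not in best or bits > best[query][2]:
            best[query] = (subject, identity, bits)
    return best

# Hypothetical rows (invented probe names and scores, not data from the study)
example = [
    "probe_001\tPhi315.5\t99.0\t800\t8\t0\t1\t800\t100\t899\t0.0\t1450",
    "probe_001\tPhi370.1\t85.0\t300\t45\t0\t1\t300\t10\t309\t1e-80\t300",
    "probe_002\tPhi8232.5\t97.5\t600\t15\t0\t1\t600\t50\t649\t0.0\t1050",
]
hits = best_hits(example)

# Tally how many probes have each subject as their best hit
tally = Counter(subject for subject, _, _ in hits.values())
print(tally)
```

A tally of this kind is the sort of summary that a distribution of best hits per subject sequence would be built from.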
For sequence alignment and phylogenetic analysis, we used the AlignX feature of VNTI, ClustalW (44), PHYLIP (20), and njplot (39). To investigate phage mosaicism, we used BLASTN rather than BLASTX for three reasons: nucleotide sequence similarity is a more relevant indicator of phage-phage relationships, especially in the case of closely related proteins; it shows the similarity of noncoding areas; and it is less affected by frameshift mutations or by accidental sequencing errors.
Confirmation of phage excision. To confirm phage excision from the GAS chromosome, we performed PCRs using primer pairs that flank the phage attachment sites (attP), the bacterial attachment sites (attB), and the attL sites. In an integrated prophage, attP primers are divergent and will be unable to yield a single PCR product, while the primers that flank attB would have to amplify the whole prophage (>30 kb). Only attL-flanking primers are expected to yield a product in an integrated prophage. Conversely, in a circularized phage, only those primers flanking attP and attB are expected to give a product. Therefore, a positive result from the attP primer pair was taken as evidence of the phage's ability to be excised from the genome; excision was further confirmed by amplification of an appropriate-sized product when attB primer pairs were used, indicating that the phage had been excised from the chromosome, restoring the original boundaries of the bacterial attB sequence.
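The expected PCR outcomes described above can be summarized as a small lookup. This is only an illustrative sketch of the stated logic, not a description of any software used in the study; the function names and state labels are hypothetical, and it assumes that a product spanning a whole prophage (>30 kb) is not obtained under standard PCR conditions.

```python
def expected_products(phage_state: str) -> dict:
    """Return which primer pairs are expected to yield a PCR product.

    'integrated': prophage inserted in the chromosome (attP primers diverge,
                  attB primers would have to span the >30-kb prophage).
    'excised':    phage circularized and chromosome restored at attB.
    """
    if phage_state == "integrated":
        return {"attP": False, "attB": False, "attL": True}
    if phage_state == "excised":
        return {"attP": True, "attB": True, "attL": False}
    raise ValueError(f"unknown phage state: {phage_state!r}")

def forms_detected(observed: dict) -> set:
    """Interpret observed products from a culture that may contain both forms."""
    forms = set()
    if observed.get("attL"):
        forms.add("integrated prophage")
    if observed.get("attP"):
        forms.add("circularized phage")
    if observed.get("attB"):
        forms.add("restored attB")
    return forms

# A mixed population, in which all three primer pairs give a product
mixed = forms_detected({"attP": True, "attB": True, "attL": True})
print(sorted(mixed))
```

In a mixed population, products from all three primer pairs can coexist, which corresponds to a culture in which only a proportion of the cells carry the excised form.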
Phage nomenclature. In this article, we follow current convention to designate GAS prophage by names consisting of the bacterial host's name followed by a serial number that reflects the prophage chromosomal location, in clockwise order (4, 10). Since some prophages of strains SF370 and MGAS8232 in the GenBank database are given discrepant names, we list the discrepancies in Table 1. As for the phages identified in this study, we called them, according to current convention, by the strain name (M1T1) followed by the letters X, Y, and Z in the clockwise order of their chromosomal locations. To make prophages M1T1.X, M1T1.Y, and M1T1.Z easier to discuss, we also designated them SPhinX, MemPhiS, and PhiRamid, respectively.
TABLE 1. Discrepancies in prophage names
Nucleotide sequence accession numbers. The sequence data from this study have been submitted to GenBank under accession numbers AY616023 and AY621076.
FIG. 1. Summary of differential hybridization results. Distribution of best BLASTN hits for the sequences that hybridized preferentially with M1T1 but not with SF370 DNA (raw data are provided in Table S1).
Mosaicism of the three M1T1 phages.
As mentioned above, SPhinX and PhiRamid bear very little sequence similarity to SF370 prophages, with the exception of very few areas that are highly conserved in most GAS phages. When the genomes of these two M1T1-specific phages were compared to the GenBank sequences, a striking genetic mosaicism was seen (Fig. 2). For example, SPhinX has an area with substantial similarity (99% identity along 23 kb) to Phi315.5 in the MGAS315 M3 strain (Fig. 2A, segment Xd). This area of high similarity between the two SpeA-encoding prophages includes most of the phage structural genes (head and tail morphogenesis), as well as the lysis cassette and the speA virulence gene. However, whereas SPhinX carries the speA2 allele, Phi315.5 carries speA3. The remainder of the SPhinX genome is rather different from that of Phi315.5: its replication module is mostly similar to Phi8232.5 (Fig. 2A, segment Xc), and its lysogeny module is mostly similar to SSA-carrying Phi315.2 and to Phi8232.3, which encodes SpeL and SpeM (Fig. 2A, segments Xa and Xe).
FIG. 2. Mosaic nature of M1T1 prophages. The diagram shows the patterns and extent of similarity between different segments of SPhinX (A), MemPhiS (B), and PhiRamid (C) and their closest homologs among GAS prophages. Best BLASTN hits are shown below or above each phage segment, and, in some cases, the percentage of nucleotide identity is indicated. White boxes represent sequences with no BLASTN hits.
As for MemPhiS, the first 5-kb segment of this prophage is virtually identical to MF3-carrying Phi370.3 (>99%) and thus was not picked by differential hybridization but rather by sequencing. Many regions in the remaining 30 kb of MemPhiS are highly similar to regions in Phi370.3 (90% to 98% identity at the nucleotide level); however, this phage is more similar (>99%) to MF4-carrying Phi315.3 (Fig. 2B). Interestingly, the majority of the M1T1 isolates tested are mf3+/mf4−, whereas far fewer M1T1 isolates are mf3−/mf4+; the gene products of mf3 and mf4 are only 20% identical at the amino acid level.
In all three prophages, there were areas with no or very poor BLASTN hits, e.g., parts of the lysogeny module of SPhinX and PhiRamid and parts of PhiRamid's lysogenic conversion module (white boxes in Fig. 2). Some of these unique islands are AT rich, similar to repeats found in the M protein and the SOF-encoding genes. It is tempting to speculate that these islands, which are flanked by highly conserved genes, result from interphage recombination, but this remains to be investigated.
Excision of SPhinX, PhiRamid, and MemPhiS from the GAS chromosome. Based on the above sequence information, we designed PCR primers that would only generate products if the phages were circularized and we found that at least a proportion of each of the three M1T1 phage populations is present in circular form, i.e., is excised from the chromosome (Fig. 3). The sequence of the PCR products that encompass the attachment sites (attP) of all three circularized phages not only confirmed their excision but also provided direct evidence that the putative phage attachment sites and their core repeats were as predicted by sequence similarity of the redundant prophage ends.
FIG. 3. PCRs showing phage excision and integration. (A, upper part) Map of the different genes in the lysis and the lysogeny modules of GAS prophages. (A, lower part) Map of the same genes' relative positions when the phage is circularized. Genes are not drawn to scale. Positions of PCR primers are shown by small black arrows (a, b, c, and d). (B) PCRs show the presence of each phage (SPhinX, MemPhiS, and PhiRamid) in both attached and excised forms. All PCR products were sequenced and their sequences validated.
We compared the nucleotide sequences of the three lysogenic conversion modules of SPhinX, MemPhiS, and PhiRamid and identified a highly conserved ORF, lacking a signal peptide, located between the toxin gene and the phage attachment site (Fig. 4A), and, based on its location, we called it paratox. Paratox was found in 18 out of 24 GAS prophages in strains SF370, MGAS8232, MGAS315, and SSI-1, and in all cases, it was located adjacent to a toxin gene. Paratox homologs were also found in some phages in Streptococcus agalactiae and Streptococcus thermophilus. GAS prophages that do not have virulence genes (e.g., Phi315.1, PhiSPsP6, and Phi370.4) lack paratox-like genes. Curiously, no paratox homologs were identified in SpeC-carrying Phi370.1 and Phi8232.1 or in SpeH- and SpeI-carrying Phi370.2; none of these phages is in the M1T1 strain.
FIG. 4. Paratox: a highly conserved ORF in M1T1 prophages. (A) A comparison between the lysogenic conversion modules and attachment sites of the three M1T1 prophages shows a highly conserved ORF that best matches a hypothetical phage protein located between each toxin and the phage attachment site. We named this hypothetical protein paratox (prx). Shaded areas indicate nucleotide similarity, and the percentage nucleotide identity is given. (B) Alignment of paratox protein alleles shows highly conserved amino acid sequence (represented by dots). Representative motifs linked to particular toxins or to phage attachment sites are boxed. All sequences are extracted from GenBank; in cases where the prx sequences were not annotated as ORFs, we picked them based on their similarity to the annotated ones. Each Prx will be referred to as (Prx_tox_Phi#), where tox is the name of the adjacent toxin and Phi# is the phage name and number (e.g., Prx_SpeA2_M1T1.X is the product of the paratox gene adjacent to SpeA2 in Phi M1T1.X, alias SPhinX). Serial numbers (1 to 11) were given to the distinct paratox alleles shown.
The conservation of the prx gene and its linkage disequilibrium to the tox gene also suggest that prx may be one of two hot spots of recombination (arms) flanking the virulence genes and promoting their dissemination between prophages by recombination. This hypothesis can only be valid if the sequence flanking the tox gene, opposite to prx (Fig. 5), is conserved and is at least in partial linkage disequilibrium with tox. To investigate this possibility and identify the putative second hot spot of recombination, we aligned the predicted amino acid sequences of the hol, lys, and hylP gene products in all known GAS prophages (Fig. S9C to E). The phylogenetic analyses confirmed their high conservation among different prophages and thus suggested that any of them could be the second recombination hot spot (Fig. 5). In addition, a unique feature of the three M1T1 prophages is that, unlike all published sequenced GAS strains, which possess two or more highly similar alleles of each of the Lys, Hol, and HylP proteins carried on different phages per strain, the M1T1 strain does not show this redundancy. Instead, each of the three M1T1 phages has its unique lysin, holin, or hyaluronidase. Altogether, this strain contains two nonhomologous holins, three weakly similar lysins, and two divergent hyaluronidases (Fig. S9).
FIG. 5. Putative model for toxin exchange between phages. Possible scenarios that may contribute to toxin exchange between different prophages by recombination are shown. Two recombination hot spots are shown on both sides of the toxin genes: one of them is the prx gene, and the other may be either lys, hol, or hylP.
Overall, the majority of the differences between M1T1 and SF370 are phage-related sequences. Interestingly, 78% of the phage-related sequences unique to M1T1 are shared by the two sequenced M3 strains MGAS315 (5) and SSI-1 (37). Our findings are supported by an earlier report that an invasive M1 subclone differs from other members of the same M1 serotype by two prophages, T13 and T14 (14). Although no sequence was provided for T13 and T14, it is likely that they are closely related to SPhinX and PhiRamid, which distinguish M1T1 from SF370. These two prophages carry the speA2 and sda1 genes; homologs of these genes, which, respectively, encode a potent superantigen and a DNase, are also present in the M3 strains (speA3 and sdn). SpeA is a well-characterized superantigen that plays a pivotal role in STSS pathogenesis (38), and we recently demonstrated the DNase activity of Sda1 and showed that its unique carboxy terminus potentiates its nuclease activity (3). Inasmuch as M1T1 and M3 strains have been frequently isolated from severe invasive streptococcal infections, it is reasonable to suspect that these prophages and/or the toxins they encode may be conferring an added virulence on these strains. Despite the similarities between the M1T1 and M3 prophages, important differences were found, including differences in the attachment sites of phage pairs with similar structural genes (SPhinX and Phi315.5, as well as PhiRamid and Phi315.6), unique integrase genes, and unique modules found only in the M1T1 phages (Fig. 2).
Genetic mosaicism in M1T1 prophages.
Analysis of M1T1 phage genomes suggests that they have diversified by exchanging information and shuffling genetic modules in a pattern that makes each M1T1 prophage a unique entity, sharing blocks of sequences with different prophages but also possessing unique sequences with no known homologs in the current databases. This phenomenon, also known as genetic mosaicism, is a hallmark of tailed phages (24), to which the streptococcal phages belong, and is likely to increase phage fitness and to enhance the dissemination of the genes located within the shuffled modules (25, 26). In addition to the expected similarities of M1T1 prophages to other GAS prophages, we identified sequences within the M1T1 phages that were best matched to phage-related sequences in other bacterial species. For example, the integrase gene of PhiRamid was mostly similar (53%) to phage Sa2 of S. agalactiae. These observations lead us to suggest that the newly identified M1T1 prophages and/or related phages may have taken habitat in other GAS strains, as well as in different bacterial species, where they may have plucked certain sequences or modules and left others behind.
Role of prophage in M1T1 GAS evolution and subclone emergence. The recent completion of several GAS genomes demonstrated how bacteriophages account for major differences between the M serotypes (4, 5, 37), and, even within the same serotype, subclones emerge that have different prophage contents (6). This notion is illustrated in this study by the identification of SPhinX and PhiRamid, which distinguish two subclones of the M1 serotype, M1T1 and SF370.
A third prophage, MF4-encoding MemPhiS, identified in a few M1T1 isolates, signals the presence of two lineages of the M1T1 subclone: a major mf3+/mf4− lineage and a minor mf3−/mf4+ lineage. MemPhiS is more similar to mf4-carrying Phi315.3, found in the M3 strains, than to mf3-carrying Phi370.3, found in the SF370 strain. All three phages belong to a family of r1t-like phages (18) that are the most highly conserved prophages in GAS, as each GAS strain sequenced so far has an r1t-like prophage that is inserted between the hlpA and cadA genes. The fact that MemPhiS and Phi315.3 are virtually identical (99% nucleotide identity) suggests that they share a recent common ancestor or that one was derived from the other. To our knowledge, this is the first example of a virtually identical prophage present in two different M serotypes.
The mf4 gene was first detected in the genome of MGAS315, and, until this report, has only been found in M3 strains. However, Beres et al. showed recently that only 4/255 M3 isolates screened lacked the mf4 gene (6), and it would be interesting to test whether these 4 strains carry mf3, as our data suggest that a total or partial prophage exchange event occurred between the M1T1 and M3 strains. The facts that both M1T1 and M3 strains studied were isolated from invasive GAS infections and were from Canadian patients living in the Ontario area (6) make this exchange a likely scenario.
Role of prophage in harboring, disseminating, and remodeling GAS virulence factors. Another question addressed in this study is how phages acquire and exchange virulence genes. In the case of streptococcal phage-encoded toxins, it is believed that these genes were acquired by inaccurate phage excision in a bacterial host with a lower G+C content (10). Data from recent environmental phage genomes show that some phage-encoded toxins are found in marine phages isolated from distant closed habitats (Forrest Rohwer, personal communication). While it is not possible to know exactly how an ancestral prophage acquired a bacterial toxin with a secretion signal, evidence from various phage and bacterial genomes suggests that the toxins are mostly spread by horizontal gene transfer between different prophages. We believe that the exchange of toxins not only helps their spread in nature but also promotes their diversification as in the case of streptodornases (3).
The finding of highly conserved sequences on both sides of the toxin genes in the phages reported here supports the homologous recombination model for toxin mobilization between various phages (7, 10). Among sequences flanking the toxins was a highly conserved ORF that we named paratox (prx). Despite their high similarity, paratox proteins could be classified into alleles, and we found linkage disequilibrium between particular paratox alleles and specific toxin genes, suggesting that the prx and toxin genes are inherited and disseminated as one cassette. Interestingly, whereas most of the paratox sequence is in linkage disequilibrium to the toxin gene, its C-terminal sequence appears to be in linkage with the phage attachment site (boxes in Fig. 4B), suggesting that recombination takes place within the paratox sequence. As more GAS genomes become available, and more paratox alleles are identified, this notion may be further validated.
Conclusion. Our goal was to identify, at the genomic level, unique features that distinguish the clonal M1T1 strain from the closely related SF370 strain. We identified three prophages in M1T1 that contribute largely to its uniqueness. The finding that prophage may play a role in subclone diversification, and the fact that the prophage cassettes can be shared between the M1T1 and M3 serotypes, brings into question the validity of the current GAS classification system, particularly when attempts are made to associate certain serotypes with specific clinical manifestations of GAS infections. In our opinion, the M serotype designation is no longer sufficient or clinically useful. The advent of genomic tools and the completion of several GAS genome sequences have unraveled the extent of the horizontal transfer of mobile genetic elements (IS and phages) among the various strains, and it is perhaps incumbent upon us to explore a new classification schema that better represents the basis for GAS virulence and their involvement in specific diseases.
This work was supported by grant AI40198-07 from the National Institute of Allergy and Infectious Diseases, National Institutes of Health (to M.K.), by the Research and Development Office, Medical Research Service, Department of Veterans Affairs (Merit Award to M.K.), and by a UTHSC Center of Excellence in Genomics and Bioinformatics grant (M.K., R.A.E., and R.K.A.). An abstract including parts of this study was awarded the ASM student travel grant (R.K.A.) in the ASM Functional Genomics and Bioinformatics Conference, 2004, Portland, OR.
Supplemental material for this paper may be found at http://jb.asm.org/.
Present address: Fellowship for Interpretation of Genomes and San Diego State University, San Diego, CA 92182.