Journal of Bacteriology, April 2007, p. 3166-3175, Vol. 189, No. 8
0021-9193/07/$08.00+0 doi:10.1128/JB.01808-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Joao M. Alves,2,3,
Todd Kitten,1,2,3
Arunsri Brown,1,
Zhenming Chen,2,3,¶
Luiz S. Ozaki,2,3
Patricio Manque,2,3
Xiuchun Ge,1
Myrna G. Serrano,2,3
Daniela Puiu,2,||
Stephanie Hendricks,3
Yingping Wang,2,3
Michael D. Chaplin,2
Doruk Akan,2,
Sehmi Paik,1,3,
Darrell L. Peterson,4
Francis L. Macrina,1,2,3* and
Gregory A. Buck2,3*
Philips Institute of Oral and Craniofacial Molecular Biology, Virginia Commonwealth University, Richmond, Virginia 23298-0566,1 Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia 23284-2030,2 Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, Virginia 23298-0678,3 Department of Biochemistry and Molecular Biophysics, Virginia Commonwealth University, Richmond, Virginia 23298-0614,4
Received 30 November 2006/ Accepted 29 January 2007
-amylase (27). Once bound, S. sanguinis serves as a tether for the attachment of other oral microorganisms that colonize the tooth surface, form dental plaque, and contribute to development of caries and periodontal disease (46). S. sanguinis may also interfere with colonization of the tooth by Streptococcus mutans, the primary species associated with dental caries (16), and its presence therefore may also be beneficial for oral health. The viridans streptococci are the most common cause of native-valve infective endocarditis, and S. sanguinis is the viridans streptococcus most commonly implicated in this disease (66). S. sanguinis and other viridans streptococci are also emerging as important bloodstream pathogens in infections that threaten neutropenic patients (1), and these infections may be complicated by an increasing frequency of antibiotic resistance (71). The reasons underlying this previously unrecognized virulence are unknown, and antibiotic resistance is disquieting because viridans streptococci, including S. sanguinis, have been classified historically as penicillin sensitive and for many years were believed to be unable to become resistant to β-lactam antibiotics.
Here, we describe the sequence and an analysis of the genome of S. sanguinis strain SK36, which was originally isolated from human dental plaque (43). Analysis of the predicted proteins yielded new insights into potential pathogenicity and virulence factors in this important bacterium, which allowed comparison with virulence mechanisms in other streptococci. Furthermore, about 28% of the predicted proteins were confirmed with high confidence by mass spectrometry (MS).
Genome sequencing and annotation.
The genome was sequenced using a modified whole-genome shotgun strategy, as previously described (98). In short, two shotgun libraries (1- to 2-kb and 2- to 4-kb inserts) and one BAC library (500 clones, 25- to 100-kb inserts) were constructed, and approximately 74,000 sequences were generated (15-fold coverage of the genome) by using an ABI 3700 96-lane capillary DNA sequencer (Applied Biosystems). The genomic sequence was assembled as previously described (98). Gaps were closed by genome walking (Clontech), alignment with BAC clones, long-distance PCR, and multiplex PCR (89). All remaining low-quality sequence regions were amplified and resequenced for finishing. About 5,000 sequences were added during gap closing and finishing. Genome annotation was performed automatically essentially as previously described (98). Gene predictions were based on Glimmer (77), database searches, and manual verification in Apollo (50). rRNA boundaries were set based on predicted structural criteria (15).
HGT analyses. To select candidates for horizontal gene transfer (HGT), the phyletic patterns of gene distribution were analyzed. First, S. sanguinis proteins were compared to the NCBI nonredundant protein database using BLASTP. Significant matches (E < 1e-6) were analyzed to find genes without streptococcal sequences among the top six species matching the S. sanguinis protein. The same analysis was performed with Escherichia coli K-12, considering Salmonella and Yersinia the "same" genus (these genera were chosen as the genera that were closest phylogenetically to E. coli since no other species of Escherichia have been sequenced). This analysis overestimated the number of HGT candidates in E. coli due to the narrow sampling of genetic diversity in the genus compared to the broad sampling available for streptococci.
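The phyletic-pattern filter described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used for the analysis: the `Hit` record, the function names, the rule of reading the genus from the first word of the species name, and the choice to skip proteins with no significant match are all assumptions introduced here. Only the E-value cutoff (E < 1e-6), the "top six species" window, and the grouping of Salmonella and Yersinia with Escherichia come from the text.

```python
from typing import Iterable, List, NamedTuple, Set


class Hit(NamedTuple):
    species: str   # full species name of the database match (hypothetical field)
    evalue: float  # BLASTP E-value of the match


# Genera treated as "self" for each query organism, following the text.
STREPTOCOCCI = {"Streptococcus"}
ENTERICS = {"Escherichia", "Salmonella", "Yersinia"}


def top_species(hits: Iterable[Hit], e_cutoff: float = 1e-6,
                n_species: int = 6) -> List[str]:
    """Return up to n_species distinct species among significant hits, best first."""
    ranked: List[str] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if hit.evalue >= e_cutoff:
            break  # hits are sorted, so all remaining hits are non-significant
        if hit.species not in ranked:
            ranked.append(hit.species)
            if len(ranked) == n_species:
                break
    return ranked


def is_hgt_candidate(hits: Iterable[Hit], own_genera: Set[str],
                     e_cutoff: float = 1e-6, n_species: int = 6) -> bool:
    """Flag a protein whose top-matching species include none of its own genera."""
    species = top_species(hits, e_cutoff, n_species)
    if not species:
        # Assumption: a protein with no significant match is not scored at all.
        return False
    # Assumption: the genus is the first word of the species name.
    return all(name.split()[0] not in own_genera for name in species)
```

For example, a protein whose only significant matches are to Listeria and Clostridium species would be flagged when `own_genera=STREPTOCOCCI`, whereas an E. coli protein whose best match is to Salmonella would not be flagged when `own_genera=ENTERICS`.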
Proteomic analysis of S. sanguinis.
Total protein was extracted from S. sanguinis grown overnight in brain heart infusion broth. Cells were harvested by centrifugation, washed twice in ice-cold phosphate-buffered saline, and suspended in 20 mM morpholinepropanesulfonic acid (MOPS)-62.5 mM NaCl-0.5 mM MgSO4 (pH 7.8) with a protease inhibitor cocktail (Sigma-Aldrich). The cells were mechanically disrupted with an FP120 FastPrep cell disruptor (Bio 101 Systems, Qbiogen, Inc.) by using three 30-s cycles of homogenization at the maximum speed, with 1-min intervals on ice between cycles. The suspension was centrifuged (5,000 x g for 15 min at 4°C) to remove unbroken cells and large cellular debris. The supernatant was suspended in solubilization buffer as previously described (68) and was precipitated with a 2D clean-up kit (GE Healthcare). After reduction with dithiothreitol and iodoacetamide alkylation, proteins (75 µg) were digested overnight with trypsin. The resulting tryptic peptides were desalted using C8 cartridges (Michrom BioResources) and were subjected to two-dimensional nano liquid chromatography-MS/MS analyses with a Michrom BioResources Paradigm MS4 multidimensional separation module, a Michrom NanoTrap platform, and an LCQ Deca XP Plus ion trap mass spectrometer. The mass spectrometer was operated in the data-dependent mode, and the four most abundant ions in each MS spectrum were selected and fragmented to produce tandem mass spectra. The MS/MS spectra were recorded in the profile mode. Proteins were identified by searching the MS/MS spectra against our S. sanguinis database using Bioworks v3.2. Peptide and protein hits were scored and ranked using the new probability-based scoring algorithm incorporated in Bioworks v3.2. Only peptides identified as possessing fully tryptic termini with cross-correlation scores greater than 1.9 for singly charged peptides, 2.3 for doubly charged peptides, and 3.75 for triply charged peptides were used for peptide identification. In addition, the delta-correlation scores had to be greater than 0.1, and for increased stringency, a protein was accepted only if its probability score was <0.0001.
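The acceptance criteria listed above amount to a simple charge-dependent filter, which can be written out as a short Python sketch. The numeric thresholds are taken from the text; the `PeptideHit` record, the function names, the rejection of charge states other than 1 to 3, and the requirement of at least one passing peptide per protein are assumptions made for illustration and do not describe the Bioworks software itself.

```python
from dataclasses import dataclass
from typing import Iterable

# Thresholds stated in the text.
XCORR_MIN = {1: 1.9, 2: 2.3, 3: 3.75}  # minimum cross-correlation score by charge
DELTA_CN_MIN = 0.1                      # minimum delta-correlation score
PROTEIN_PROB_MAX = 1e-4                 # protein probability score must be below this


@dataclass
class PeptideHit:
    """Hypothetical record for one peptide-spectrum match."""
    charge: int
    xcorr: float
    delta_cn: float
    fully_tryptic: bool


def peptide_passes(hit: PeptideHit) -> bool:
    """Apply the per-peptide criteria: tryptic termini, Xcorr, and delta-correlation."""
    threshold = XCORR_MIN.get(hit.charge)
    if threshold is None:
        # Assumption: charge states other than 1+, 2+, and 3+ are not accepted.
        return False
    return hit.fully_tryptic and hit.xcorr > threshold and hit.delta_cn > DELTA_CN_MIN


def protein_accepted(probability: float, peptides: Iterable[PeptideHit]) -> bool:
    """Accept a protein if its probability score and at least one peptide pass."""
    # Assumption: one passing peptide is sufficient; the text does not give a minimum.
    return probability < PROTEIN_PROB_MAX and any(peptide_passes(p) for p in peptides)
```

Under these assumptions, a doubly charged, fully tryptic peptide with a cross-correlation score of 2.5 and a delta-correlation score of 0.2 passes, while the same peptide with a cross-correlation score of 2.3 does not, because the thresholds are strict inequalities.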
Nucleotide sequence accession number. The S. sanguinis SK36 genome sequence has been deposited in the GenBank database under accession no. CP000387.
1.2 Mbp downstream from the origin of replication (Fig. 1). The G+C content of the genome is 43.40%, which is higher than the G+C content of any of the 21 other completed streptococcal genomes (35.62 to 39.72%) (Table 1). For protein-encoding genes, the G+C contents are 53.55, 35.46, and 44.35% for positions 1, 2, and 3, respectively. Based on the relationship between the G+C contents of whole genomes and the G+C contents of position 3 of coding sequences, which were recently determined for 232 eubacterial genomes (93), the expected value for position 3 in S. sanguinis is 42.5%, in good agreement with the observed value. This observation suggests that unlike the findings for Lactobacillus bulgaricus, the higher overall G+C content of S. sanguinis is not due to an ongoing process of compositional change or to a different relationship of whole-genome and third-position G+C values. There are four rRNA operons containing the 5S, 16S, and 23S rRNA genes, which is less than the number in most other streptococci (Table 1), despite the larger genome size and in contrast to a reported correlation between the numbers of rRNA and tRNA genes and the genome sizes in the Firmicutes (93). The 61 predicted tRNA genes encode all 20 amino acids, but wobble rules are required for several abundant codons (www.sanguinis.mic.vcu.edu/supplemental.htm). Most tRNA genes are clustered near the rRNA operons; i.e., 48 of 61 of these genes were less than 1 kb from an rRNA operon (Fig. 1), as in Streptococcus pneumoniae (88).
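The per-position G+C values reported for protein-encoding genes can be computed as in the following Python sketch. It is a minimal illustration rather than the method used for the published values: the function name is hypothetical, and the sketch assumes that each input sequence is an in-frame coding sequence read from its first codon, that trailing bases not forming a complete codon are ignored, and that ambiguous bases are skipped.

```python
from typing import Iterable, Tuple


def gc_by_codon_position(coding_seqs: Iterable[str]) -> Tuple[float, float, float]:
    """Return the percent G+C at codon positions 1, 2, and 3 over all sequences."""
    gc = [0, 0, 0]     # G or C counts per codon position
    total = [0, 0, 0]  # unambiguous base counts per codon position
    for seq in coding_seqs:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
        for i in range(usable):
            base = seq[i]
            if base not in "ACGT":
                continue  # skip ambiguous bases such as N
            position = i % 3
            total[position] += 1
            if base in "GC":
                gc[position] += 1
    return tuple(100.0 * g / t if t else 0.0 for g, t in zip(gc, total))
```

For the toy input `["ATGGCC"]`, the codons are ATG and GCC, so the function returns 50% for position 1 (A, G), 50% for position 2 (T, C), and 100% for position 3 (G, C).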
TABLE 1. Comparison of S. sanguinis SK36 genome with other streptococcal genomes
FIG. 1. Circular S. sanguinis SK36 genome map. Starting from the outside, the circles show (i) the genome positions (in base pairs) starting from the origin of replication (ORI); (ii and iii) predicted coding regions on the two strands (different colors are used for clarity); (iv) G+C content (in 1-kb windows); (v and vi) rRNA clusters on the two strands; and (vii and viii) tRNA on the two strands.
The S. sanguinis SK36 genome was compared with other genomes to identify the proteins that are conserved among streptococci. Figure 2 shows the homologous proteins that are shared by S. sanguinis, S. mutans, and S. pneumoniae. This analysis indicated that S. sanguinis shares 23 more proteins with S. mutans than with S. pneumoniae and that the latter two species share only 19 proteins not present in S. sanguinis. Previous analyses based on rRNA (41) and our more broadly based phylogenetic analysis confirmed that S. sanguinis is more closely related to S. pneumoniae than to S. mutans, suggesting that the similarity with S. mutans reflects the shared oral niche of these two species. The proteins shared by only S. sanguinis and S. mutans include 60 proteins that are hypothetical or have unknown functions and, interestingly, 34 putative transcriptional regulators. All proteins in the S. sanguinis genome were functionally categorized and compared (Fig. 3) essentially as previously described (98).
FIG. 2. In silico comparisons of streptococci. The protein sets of S. sanguinis SK36, S. mutans UA159, and S. pneumoniae TIGR4 were compared. The numbers under and above the species names indicate the total numbers of genes; the numbers in the intersections indicate the numbers of genes shared by two or three species.
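A three-way comparison of this kind reduces to set operations once each protein has been assigned to a homolog group. The Python sketch below shows only that counting step. How homologs were defined for the comparison is not stated in this section, so the group identifiers and the function name are hypothetical, and the sketch assumes that group assignment has already been done.

```python
from typing import Dict, Hashable, Set


def venn3_counts(a: Set[Hashable], b: Set[Hashable], c: Set[Hashable]) -> Dict[str, int]:
    """Count homolog groups in each of the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b_only": len((a & b) - c),  # shared by a and b but absent from c
        "a_c_only": len((a & c) - b),
        "b_c_only": len((b & c) - a),
        "all": len(a & b & c),
    }
```

With `a`, `b`, and `c` holding the group identifiers for S. sanguinis, S. mutans, and S. pneumoniae, respectively, the difference `a_b_only - a_c_only` corresponds to the kind of comparison made in the text between proteins shared with S. mutans and proteins shared with S. pneumoniae.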
FIG. 3. COG classification of the S. sanguinis SK36 genome and comparison with other microbial genomes. The numbers of genes of eight species were compared based on the functional classification in the COG database. Ss, S. sanguinis SK36; Spy, S. pyogenes M1GAS; Sm, S. mutans UA159; Sp, S. pneumoniae R6; Sa, S. agalactiae NEM316; St, S. thermophilus CNRZ1066; Ef, Enterococcus faecalis V583; Ll, Lactococcus lactis IL-1403. The functional categories are indicated as follows: A, amino acid transport and metabolism; B, carbohydrate transport and metabolism; C, cell division and chromosome partitioning; D, cell envelope biogenesis, outer membrane; E, cell motility and secretion; F, coenzyme metabolism; G, defense mechanisms; H, DNA replication, recombination, and repair; I, energy production and conversion; J, unknown function; K, general function prediction; L, inorganic ion transport and metabolism; M, lipid metabolism; N, nucleotide transport and metabolism; O, posttranslational modification, protein turnover, chaperones; P, secondary metabolite biosynthesis, transport, and catabolism; Q, signal transduction mechanisms; R, transcription; S, translation, ribosomal structure, and biogenesis; T, other.
Similar to S. mutans (2) and other streptococci, S. sanguinis has an incomplete citrate cycle and contains only the enzymes to convert oxaloacetate into 2-oxoglutarate. Although clearly incapable of direct ATP production, this pathway fragment likely generates intermediates in the synthesis of aspartate and glutamate.
Our analysis suggests that S. sanguinis has a robust biosynthetic capacity. All key enzymes for gluconeogenesis are present. This bacterium has both pyruvate-phosphate dikinase (EC 2.7.9.1) (encoded by SSA_1053) found in other streptococci and phosphoenolpyruvate synthase (EC 2.7.9.2) (encoded by SSA_1012 and SSA_1016) that is absent in other streptococci. There is also a Firmicutes-specific fructose-1,6-bisphosphatase (EC 3.1.3.11) (encoded by SSA_1056) that is present in Streptococcus agalactiae but not in S. pneumoniae, S. mutans, Streptococcus pyogenes, or Streptococcus thermophilus. Phyletic pattern analyses suggested that the genes for these enzymes were acquired by HGT (www.sanguinis.mic.vcu.edu/supplemental.htm). Similarly, enzymes in the pentose phosphate pathway and enzymes in the purine and pyrimidine pathways, which are required for de novo synthesis of nucleotides with the possible exception of dTTP, seem to be available. Enzymes necessary for converting glutamate and glutamine to intermediates in purine and pyrimidine synthesis are also present. However, as in S. mutans (2), the gene for nucleoside diphosphate kinase (EC 2.7.4.6), which phosphorylates dTDP to dTTP, could not be identified. Since these enzymes are highly conserved in other streptococci, it is unlikely that we missed identifying their genes, assuming that they are derived from common progenitors.
S. sanguinis seems to have the ability to synthesize de novo all essential amino acids except the branched amino acids (leucine, isoleucine, and valine), lysine, and tryptophan (www.sanguinis.mic.vcu.edu/supplemental.htm). This conclusion is in agreement with our finding that S. sanguinis cannot grow in a semidefined biofilm medium (52) if supplemental amino acids are not included (data not shown). Synthesis of asparagine likely relies on a two-step process in which aspartate is bound to tRNAAsn by a nondiscriminating Asp-tRNA synthetase, followed by conversion of the aspartate to asparagine via a three-subunit aspartyl/glutamyl-tRNA amidotransferase, as has been shown for Deinococcus radiodurans (62). The latter enzyme is probably also responsible for conversion of Glu-tRNAGln to Gln-tRNAGln, thus explaining the lack of a gene encoding glutaminyl-tRNA synthetase in the genome (72). As noted above, enzymes for gluconeogenesis are present and could permit the bacterium to convert some amino acids (e.g., serine) into fructose-6-phosphate, an entry point of the pentose phosphate pathway. In this way, amino acids can be converted into the precursors of nucleotide biosynthesis. Marri et al. (58) recently reported that among the streptococci, S. mutans is unique in possessing the genes responsible for biosynthesis of histidine and that S. pyogenes is unique in its apparent ability to convert histidine to glutamate. S. sanguinis possesses the genes for both of these processes.
Lipid biosynthesis apparently follows the classical bacterial type II fatty acid synthase complex pathway (34). As shown previously for S. pneumoniae (33, 57), S. sanguinis encodes the enoyl-(acyl-carrier protein) reductase (EC 1.3.1.9) FabK instead of the widespread and conserved FabI type enzyme of other bacteria and plants. The FabK enzyme of S. pneumoniae is less sensitive to inhibition by the antimicrobial triclosan than FabI is (33, 57). Therefore, S. sanguinis is probably more resistant than FabI-containing bacteria to inhibition of lipid biosynthesis by the triclosan used in some toothpastes. Fatty acids can be generated from amino acids since enzymes needed for the conversion of some amino acids (e.g., serine) into acetyl coenzyme A are present (www.sanguinis.mic.vcu.edu/supplemental.htm).
As expected, the S. sanguinis genome contains the genes required for cell wall sugar, peptidoglycan, and teichoic acid biosynthesis and degradation (www.sanguinis.mic.vcu.edu/supplemental.htm). Single copies of the genes encoding homologs of the S. mutans signal recognition particle components Ffh, FtsY, and small cytoplasmic RNA are present in S. sanguinis, as are single copies of the genes encoding the secretion components YidC1, YidC2, YajC, SecA, and SecYEG (31).
HGT.
In contrast to S. pneumoniae, in which 5% of the genome is composed of insertion sequences (IS) (88), we found only two apparently functional IS elements (SSA_0265 and SSA_0266; SSA_1361 and SSA_1362) in S. sanguinis. These elements are flanked by 4-bp direct repeats and are 80% identical at the nucleotide level to IS3 elements flanked by 3-bp repeats in S. mutans (55). Neither IS interrupts a known gene or open reading frame (ORF). Other evidence of transposable elements includes remnants of IS elements (SSA_1477 to SSA_1479 and SSA_0732) and a truncated transposase (SSA_2029). No intact prophages were found, although some apparent remnants (SSA_0235, SSA_2032, and SSA_2295 encoding an integrase/recombinase; SSA_2383 encoding a prophage maintenance system killer protein; and SSA_2282 encoding a phage infection protein) are present (www.sanguinis.mic.vcu.edu/supplemental.htm). No evidence of the presence of integrons was found. Homologs of the dpnM, dpnA, and dpnB genes of S. pneumoniae encoding the DpnII restriction-modification system are present in the S. sanguinis genome (SSA_1716 to SSA_1718). This system reduces the efficiency of HGT by phage infection, conjugative transfer, and transformation by plasmid (but not chromosomal) DNA (47). We did not find genes for the R.StsI and M.StsI components previously found in S. sanguinis 54 (44).
In spite of the relative paucity of transposon- and phage-related genes, at least 270 S. sanguinis genes (12% of the genes) were identified as candidates for HGT by observing the phyletic pattern of gene distribution (www.sanguinis.mic.vcu.edu/supplemental.htm) (see Materials and Methods). The apparent lack of phage genes and conjugative transposable elements suggests that transformation is the predominant method by which HGT occurs in S. sanguinis. Like certain other streptococci, S. sanguinis is naturally competent for transformation (25). In S. pneumoniae, 22 proteins necessary for chromosomal transformation have been identified (70). We found that 20 of these proteins have apparent orthologs in S. sanguinis (www.sanguinis.mic.vcu.edu/supplemental.htm). Neither ComW, an 80-amino-acid protein which stabilizes and activates the alternative sigma factor ComX (84) and for which there are no database matches in any other bacterium in the GenBank database, nor ComB, which functions with ComA to cleave and export competence-stimulating peptide (CSP), was identified. The SSA_1100 product exhibits similarity to ComA. However, the best matches for SSA_1100 in the GenBank database were matches to genes encoding transporters for RTX-type toxins from gram-negative bacteria (94). Since the adjacent gene encodes a putative RTX toxin, it appears that this protein transports the toxin rather than CSP. Therefore, it appears that ComA and ComB are not present in S. sanguinis. This absence may be related to the previous observation that ComC, the CSP precursor in S. sanguinis, is unique among all 125 ComC sequences from 13 streptococcal species in the GenBank database in that it lacks a double-glycine cleavage site (32). This unique cleavage site could be paired with unique proteins for processing and export.
One 70-kb cluster of 68 HGT candidates (SSA_0463 to SSA_0541) encodes an anaerobic cobalamin (vitamin B12) biosynthetic (cob) pathway, as well as propanediol utilization (pdu) and ethanolamine utilization (eut) pathways (Fig. 4; see Table S2 in the supplemental material). Many of the proteins in this cluster were identified by MS, proving that these genes are expressed.
FIG. 4. Schematic map of the 70-kb HGT region for vitamin B12 biosynthesis and related pathways. The colors indicate genes in different pathways based on homology with Salmonella, as follows: red, cob; blue, pdu; black, eut; gray, not predicted to be part of any of these three pathways; white, genes flanking the transferred region.
Cobalamin-dependent utilization of 1,2-propanediol via the pdu pathway plays an important role in Salmonella enterica serovar Typhimurium infection (20), and the pdu genes are correlated with cobalamin biosynthetic genes in terms of both location and coregulation. The S. enterica serovar Typhimurium pdu pathway contains 23 genes for the coenzyme B12-dependent catabolism of 1,2-propanediol (12). S. sanguinis has all of these genes except pduM and pduS, which encode proteins with unknown functions, and pduN, which encodes polyhedral bodies that may not be directly related to the catabolism of 1,2-propanediol (12) (see Table S2 in the supplemental material).
The eut pathway in S. enterica serovar Typhimurium is required for utilization of ethanolamine as a carbon and nitrogen source (75). Only 4 (eutB, eutC, eutD, and eutE) of the 17 genes in the S. enterica serovar Typhimurium eut operon have been correlated directly with an enzymatic activity known to be required for ethanolamine utilization (79). Three of these four genes, eutB (SSA_0519), eutC (SSA_0520), and eutE (SSA_0523), have homologs in S. sanguinis. eutD encodes a protein with phosphotransacetylase activity (14) and exhibits 40% identity with the S. sanguinis SSA_1207 ORF, which is annotated as a phosphate acetyltransferase. A two-component system (SSA_0516 and SSA_0517) that may regulate ethanolamine utilization in response to environmental factors is upstream of eutA. Since ethanolamine and propanediol sources in the environment seem largely man-made (e.g., toothpaste, mouthwash, and antifreeze) and their utilization is dependent on vitamin B12, it is interesting to speculate that this large 70-kb gene cluster may have been selected in S. sanguinis by exposure to these man-made products.
Although very few of these cobalamin-related genes are present in previously published streptococcal genomes, many are present in other oral pathogens, including Porphyromonas gingivalis, Treponema denticola, and Fusobacterium nucleatum (see Table S2 in the supplemental material). Our analyses suggest that the 70-kb cluster of HGT genes has an origin similar to the origin of orthologs in Listeria (www.sanguinis.mic.vcu.edu/supplemental.htm), but a more in-depth phylogenetic analysis involving more prokaryotic genomes is necessary to confirm its origin.
Two small discrete blocks of HGT candidate genes (SSA_1012 to SSA_1017 and SSA_1053 to SSA_1056) contain three genes involved in gluconeogenesis. The two genes in the second block (SSA_1053 and SSA_1056), encoding EC 2.7.9.1 and EC 3.1.3.11, are sufficient, in combination with other apparently native genes, to enable gluconeogenesis. These two genes are also found in S. agalactiae, theoretically enabling gluconeogenesis in this organism, while all other streptococcal genomes that have been sequenced seem to lack the complete set of genes required for gluconeogenesis. The results of our analysis (see Materials and Methods) are consistent with the hypothesis that these genes were transferred by HGT to these streptococci from other bacteria belonging to the phylum Firmicutes (www.sanguinis.mic.vcu.edu/supplemental.htm).
Putative virulence factors and adhesins. Several proteins potentially relevant to adhesion in the oral cavity or to virulence in invasive disease were identified in the S. sanguinis genome (see Table S3 in the supplemental material). Perhaps the most surprising is the protein encoded by SSA_1099 (Stx), which exhibits homology to RTX-type toxins in gram-negative bacteria (94). To our knowledge, this is the first occurrence of this class of toxin gene in a gram-positive bacterium. Consistent with this unique setting, orthologs of the HlyB ATPase and HlyD "membrane fusion protein" components of an RTX toxin export system are encoded by adjacent ORFs (SSA_1100 and SSA_1101, respectively), but no homolog of the TolC outer membrane component (36) was found. Both Stx and the putative ATPase transporter component, encoded by SSA_1100, were detected in the proteomic analysis (www.sanguinis.mic.vcu.edu/supplemental.htm). Although the leukotoxin from the oral bacterium Actinobacillus actinomycetemcomitans is a well-known ortholog of the Stx protein, the products of SSA_1099 to SSA_1101 are, as a whole, most similar to proteins in plant-pathogenic pseudomonads. Thus, the origin of these S. sanguinis genes and their functions are unclear.
The genes associated with pathogenicity in S. sanguinis also include genes encoding orthologs of the major known adhesins in other viridans species. SspC and SspD are orthologs of the SspA and SspB adhesins of Streptococcus gordonii (39, 53). Whereas the latter proteins are encoded by adjacent genes in S. gordonii, this is not true in S. sanguinis. Conversely, the cshA and cshB adhesin genes are not contiguous in S. gordonii (60), whereas the S. sanguinis crpABC orthologs are contiguous. The ligand specificity of SspA orthologs in viridans streptococci is determined by their sequences (39, 53). Neither SspC nor SspD is closely related to any SspA homolog that has been characterized previously. As determined by BLASTP analysis (3), SspC has only 55% identity with its closest relative (SspA), and SspD has 33% identity with its closest relative (PaaA of Streptococcus criceti). Therefore, it is not clear what ligand(s), if any, SspC and SspD bind. However, the 27-amino-acid region of SspB that has been shown to mediate binding of S. gordonii to P. gingivalis is conserved in SspC (18 identical residues and five similar residues), including perfect identity of the critical NITVK subsequence (21). This observation suggests that SspC may also adhere to P. gingivalis.
Lipoproteins (LP) and cell-wall anchored proteins (CWA), two classes of proteins that are surface exposed and prevalent among reported virulence factors, were predicted (www.sanguinis.mic.vcu.edu/supplemental.htm). The lgt and lspA genes expected for LP processing are present (SSA_1546 and SSA_1069, respectively), as are genes encoding three sortases (SSA_0022, SSA_1219, and SSA_1631) for CWA processing. Interestingly, the numbers of these surface proteins (60 LPs and 33 CWAs) are striking compared to the numbers in related species. As determined by the same search criteria used for S. sanguinis, S. mutans has only 29 LPs and six CWAs. S. pneumoniae TIGR4 possesses 40 LPs and 12 CWAs, while R6 has 39 LPs and 13 CWAs. However, many of the additional ORFs in S. sanguinis appear to be redundant. Thus, S. sanguinis contains nine paralogous CWAs in three families and seven paralogous LPs in three families. In addition, functional redundancy may occur in the absence of overall sequence similarity; five CWAs possess the collagen-binding domain, Pfam05737 (23). This vast array of surface proteins may contribute to the ability of S. sanguinis to colonize the tooth and interact with a diverse group of oral bacteria (46) and may account for its predominance as a cause of streptococcal endocarditis (66).
Fibrils or pili are involved in streptococcal adherence and virulence (7, 59, 82). S. sanguinis strains possess both short fibrils and long fibrils (30). Fap1 of Streptococcus parasanguinis, an ortholog of the CWA encoded by SSA_0829 or SrpA, is thought to be the structural component of long fibrils (82), and its orthologs are important for adhesion to platelets (9), saliva-coated hydroxyapatite (96), and salivary agglutinin (39). The products of SSA_0830 to SSA_0841 exhibit homology to the proteins shown to be required for the glycosylation and export of SrpA orthologs in S. parasanguinis and S. gordonii (9, 17, 85). In fact, the 11 genes downstream from srpA are most similar in terms of sequence to, and are in the same order as, the 11 genes that form the export locus of the SrpA ortholog, GspB, in S. gordonii (85). Shorter fibrils in S. gordonii are comprised of CshA and possibly also CshB (59), which are orthologs of CWAs encoded by SSA_0904 to SSA_0906. The fact that S. sanguinis has both classes of proteins, as well as the locus dedicated to SrpA export, could account for the apparent presence of both short and long fibrils. In addition, in recent studies workers have identified long pili in S. agalactiae (49), S. pyogenes (63), and S. pneumoniae (7). In these bacteria, a single locus contains three putative pilin subunit genes encoding CWA motifs and one to three sortase genes that are required for assembly of the pili (7, 49, 63). S. sanguinis also contains an apparent pilus locus, with SSA_1632 to SSA_1635 encoding LPXTG proteins and SSA_1631 encoding a sortase. SSA_1632 to SSA_1634 also each contain a conserved "E box" domain found in many pilin genes (90).
The SSA_2302 to SSA_2318 sequences exhibit homology to ORFs required for production of type IV pili. Such pili were originally believed to exist only in gram-negative bacteria, although the gram-positive bacterium Ruminococcus albus appears to possess a type IV pilus that serves as an adhesin (73). Our analysis suggests that the S. sanguinis ORFs were acquired by HGT, perhaps from a clostridial species, and are distinct from the ORFs in S. sanguinis that apparently encode the pseudopilus involved in genetic competence (data not shown).
Cell wall polysaccharides (CWP) serve as important receptors for agglutination and coaggregation in oral streptococci (19, 45, 46). S. sanguinis SK36 is similar to type strain ATCC 10556 in that it coaggregates with numerous species of Streptococcus, Actinomyces, and Fusobacterium (38, 45) (Kolenbrander and Andersen, personal communication). These interactions are inhibited by addition of 60 mM N-acetyl-D-galactosamine, confirming the polysaccharide composition of the receptor (45). Six structures have been defined for CWP in oral streptococci (19), and the loci responsible for synthesis of one of these structures have been characterized in S. gordonii (97). Orthologs of these genes are located mostly in two genomic segments in S. sanguinis, SSA_1509 to SSA_1519 and SSA_2211 to SSA_2225. However, these segments also contain apparent CWP synthesis genes that have close orthologs in S. thermophilus, Streptococcus suis, S. pneumoniae, or Streptococcus iniae but no orthologs in S. gordonii. These CWP loci, therefore, appear to be unlike any loci characterized previously, and it is not clear whether they direct the synthesis of a type 1 N-acetylgalactosamine-β1→3-galactose CWP like that found in previously characterized S. sanguinis strains (19).
Other interesting features. The S. sanguinis genome contains only two homologs of the twin-arginine translocation (Tat) system, which exports folded proteins with the characteristic N-terminal twin-arginine motif across the cytoplasmic membrane (65). SSA_1132 and SSA_1133 apparently encode the TatC Sec-independent protein translocase and the TatA Sec-independent protein secretion pathway component, respectively. Of the streptococcus genomes examined to date, this system has been found only in S. thermophilus. Our analysis showed that three genes, encoding a periplasmic lipoprotein involved in iron transport (SSA_1129), an iron-dependent peroxidase (SSA_1130), and a high-affinity Fe2+/Pb2+ permease (SSA_1131) associated with the Tat genes in S. sanguinis, are similarly associated in other genomes, including the genomes of S. thermophilus, Staphylococcus aureus MRSA252, and Staphylococcus haemolyticus. Using the TatP server (8) to search for Tat secretion substrates, we found that the iron-dependent peroxidase gene SSA_1130 was the only ORF in the genome that encoded both a consensus Tat motif and a Tat signal peptide.
Two glucosyltransferases (GTF) were found in S. sanguinis. The SSA_0613 product is a homolog of GtfR of Streptococcus oralis ATCC 10557, which synthesizes water-soluble glucans with no primer dependence (24). The SSA_1006 product is a homolog of GtfA, an enzyme that, in the presence of inorganic phosphate, converts sucrose to fructose and glucose-1-phosphate (4). Furthermore, the products of several ORFs exhibit homology to S. mutans non-GTF glucan-binding proteins (GBP), including the products of SSA_0019, SSA_0303, and SSA_0956. Non-GTF GBPs are cell surface receptors for glucan or secreted proteins that can become cell associated when glucan coats the bacterial cells. Although all GBPs have glucan-binding properties, they are a heterogeneous group of proteins with variations in size, glucan-binding domains, glucan-binding affinity, and function (4).
More than 100 putative transcriptional regulators were identified in the S. sanguinis genome (www.sanguinis.mic.vcu.edu/supplemental.htm). Like the genomes of some other streptococci, the S. sanguinis genome contains genes encoding a major sigma factor 70 (SSA_0825, rpoD) and an ortholog of the competence-specific sigma factor, ComX (SSA_0016). Genes encoding NusA (SSA_1900), NusB (SSA_0452), and NusG (SSA_2205) were found, although no obvious Rho protein was identified. This was also true for the other streptococcal genomes examined. Two genes, SSA_1187 and SSA_1695, code for additional putative antitermination proteins. Two-component regulatory systems, composed of a sensor histidine kinase and a transcriptional response regulator, provide a mechanism for bacteria to sense and respond to environmental signals. We found 29 genes that apparently comprise 14 two-component regulatory systems (www.sanguinis.mic.vcu.edu/supplemental.htm). This number is comparable to the numbers found in other streptococci (2, 26, 37, 80, 87). The "orphan" two-component response regulator encoded by SSA_1810 is an ortholog of the tissue-specific virulence factor RitR that represses the hemin-iron transport system in S. pneumoniae (92) and of the virulence factor CsrR in S. pyogenes (29), suggesting that this regulator may have a similar role in virulence in S. sanguinis.
S. sanguinis is one of the pioneer colonizers of the oral cavity and may initiate biofilm formation on tooth surfaces. Several putative biofilm-related genes are found in S. sanguinis and most other streptococci. For example, SSA_0135 to SSA_0137 are clustered in an arrangement similar to that observed for their orthologs in the adc operon, which is involved in biofilm formation in S. gordonii (52). Genes of the inducible fructose phosphotransferase operon, which is also related to biofilm formation in S. gordonii (51), are similarly clustered in S. sanguinis (SSA_1080 to SSA_1082). The SSA_1909 product is more than 60% identical to biofilm regulatory protein A (BrpA) of S. mutans. BrpA is a predicted surface-associated protein that functions not only in biofilm formation, autolysis, and cell division but also in the regulation of acid and oxidative stress tolerance in S. mutans (95).
SSA_1853 is an ortholog of the luxS gene of S. oralis 34. LuxS is responsible for the catabolism of S-ribosylhomocysteine, which produces autoinducer 2, a universal signal molecule that mediates cell-cell and interspecies communication (quorum sensing) among bacteria and influences biofilm formation and virulence (74).
Conclusion. S. sanguinis is one of the most frequently recognized pioneering inhabitants of human oral plaque (76). Completion of its genome sequence provided unique insight into the biology, virulence, and pathogenesis of this important bacterium. The greater size and G+C content of the S. sanguinis genome reflect the differences between this organism and other streptococci. The genome has clearly been molded by HGT, and the mechanisms by which the large cluster of genes in the cob, pdu, and eut pathways was transferred and confers a selective advantage on S. sanguinis are rich subjects for future investigations. Our analysis of the genome also provided fundamental genetic data for investigating the etiology of caries by comparison with cariogenic S. mutans. The biology and metabolism of this important bacterium have been described so that new prophylactic and therapeutic strategies can now be explored. Finally, workers in previous studies have used many different strains of S. sanguinis, several of which would now be classified as S. gordonii, S. parasanguinis, or other species. The availability of the SK36 sequence, as well as the bacterium itself, which has been deposited in the American Type Culture Collection (catalog no. BAA-1455), should facilitate future studies with this species.
Sequence analysis was performed in the Nucleic Acids Research Facilities at Virginia Commonwealth University.
Published ahead of print on 2 February 2007.
Supplemental material for this article may be found at http://jb.asm.org/.
P.X. and J.M.A. contributed equally to this work.
Present address: Office of International Extramural Activities, Division of Extramural Activities, NIH/NIAID, Room 2155, Bethesda, MD 20892-7610.
¶ Present address: College of Biological & Environmental Engineering, Zhejiang University of Technology, 18 ChaoWang Road, Hangzhou, Zhejiang 310032, China.
|| Present address: The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850.
Present address: Department of Systems and Information Engineering, University of Virginia, P.O. Box 400747, 151 Engineer's Way, Charlottesville, VA 22904.
Present address: Department of Biomedical Sciences, University of Maryland Dental School, 650 W. Baltimore Street, Baltimore, MD 21201.