Journal of Bacteriology, September 2005, p. 6223-6230, Vol. 187, No. 17
0021-9193/05/$08.00+0 doi:10.1128/JB.187.17.6223-6230.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Infectious Disease Epidemiology, St. Mary's Hospital, Imperial College London, London W2 1PG, United Kingdom,1 Department of Microbiology, National Public Health Institute, PL 310, 90101 Oulu, Finland,2 Department of Vaccines, National Public Health Institute (KTL), Mannerheimintie 166, 00100 Helsinki, Finland3
Received 18 March 2005/ Accepted 2 June 2005
Homologous recombination distorts the true relationships between isolates of closely related named species and can lead to inconsistent relationships among those species inferred from the sequences of different genes (6, 9). Consequently, defining species using single loci is inappropriate, particularly for those species where rates of recombination are high. The use of multilocus sequence-based approaches ensures that recombination at one locus is buffered by the more reliable indications of relatedness provided by the other loci. Furthermore, in defining any species, we must analyze populations of each candidate species and not just one or a few isolates (9).
Two general types of multilocus approaches can be considered as tools to distinguish related species. Microarrays can detect differences in gene repertoire among isolates (15) but suffer the serious disadvantage that genes that differ among isolates of a named species, or between related species, reflect the least stable part of the genome, rather than the core genome, which is likely to be the most phylogenetically informative (12). The alternative approach is to use the sequences of multiple housekeeping genes that are part of the core genome (22). These data are now widely available as isolates of many pathogens are characterized by sequencing internal fragments of seven housekeeping loci, a technique referred to as multilocus sequence typing (MLST) (13).
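As a rough illustration of the data MLST produces, the sketch below reduces each isolate to an allelic profile (one allele number per locus) and maps each distinct profile to a sequence type (ST). The locus names are those used in this work; the class `ToyMLSTDatabase`, its methods, and the short placeholder sequences are hypothetical, and numbering alleles and STs in order of first appearance is an assumption of the sketch rather than a description of how the curated MLST database assigns them.

```python
# The seven pneumococcal MLST loci, in the order used in Fig. 2.
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")


class ToyMLSTDatabase:
    """Toy allele and ST registry (hypothetical; numbers assigned on first sight)."""

    def __init__(self):
        # locus -> {fragment sequence: allele number}
        self.alleles = {locus: {} for locus in LOCI}
        # allelic profile (tuple of seven allele numbers) -> ST number
        self.sts = {}

    def allele_number(self, locus, sequence):
        """Return the allele number for a sequence, registering it if new."""
        table = self.alleles[locus]
        if sequence not in table:
            table[sequence] = len(table) + 1
        return table[sequence]

    def sequence_type(self, isolate):
        """Return (allelic profile, ST) for a dict mapping locus -> sequence."""
        profile = tuple(self.allele_number(locus, isolate[locus]) for locus in LOCI)
        if profile not in self.sts:
            self.sts[profile] = len(self.sts) + 1
        return profile, self.sts[profile]


# Placeholder fragments; real MLST fragments are several hundred base pairs long.
db = ToyMLSTDatabase()
isolate_a = {locus: "ACGTACGT" for locus in LOCI}
isolate_b = dict(isolate_a)                 # identical at all seven loci
isolate_c = dict(isolate_a, ddl="ACGTACGA")  # differs only at ddl

profile_a, st_a = db.sequence_type(isolate_a)
profile_b, st_b = db.sequence_type(isolate_b)
profile_c, st_c = db.sequence_type(isolate_c)
```

Isolates with identical sequences at every locus share an ST, whereas a single nucleotide difference at one locus yields a new allele number and therefore a new ST.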
Previously, we have shown that MLST-based approaches are capable of discriminating named species among the human Neisseria (9). In this paper, we test the utility of this approach in defining the boundaries of the named species Streptococcus pneumoniae (the pneumococcus) and the extent to which it can be resolved from closely related isolates of uncertain taxonomic status that cocolonize the human nasopharynx. The confident identification of these members of the mitis group of alpha-hemolytic streptococci is fraught with difficulty. At present, presumptive members of the species S. pneumoniae are usually identified in clinical microbiology laboratories by colonial morphology when grown on blood agar and by optochin sensitivity. In the case of optochin-insensitive isolates that otherwise appear similar to pneumococci, bile solubility may also be used. Pneumococci are further characterized by serotyping using the Quellung reaction, and isolates that can be assigned to one of the 90 recognized pneumococcal serotypes are considered unambiguously to be S. pneumoniae. After applying these tests, a number of nonserotypeable isolates that appear to be similar to pneumococci but are of uncertain taxonomic status remain.
The relationship between nontypeable pneumococcus-like isolates and genuine pneumococci has been considered by Whatmore et al. (23). Those authors point out the difficulty in resolving these issues by gene content. Genes previously thought to be limited to the pneumococcus, including those encoding the virulence factors pneumolysin (ply) and autolysin, have been found in isolates assigned to other named species such as S. oralis and S. mitis (14, 23).
This work considers the potential of MLST in defining the pneumococcus and in distinguishing it from closely related species. We used MLST to characterize a set of 121 nonserotypeable presumptive pneumococci from Finland and compared the sequences of a fragment of the ply gene with those of authentic serotypeable pneumococci. The sequence data clearly identify the nontypeable isolates as members of either the pneumococcal population or a clearly resolved and more diverse related population. We propose that these latter isolates should be recognized as distinct from pneumococci and demonstrate the utility of the sequence of a fragment of the pneumolysin gene for distinguishing nonserotypeable pneumococci from this related nonpneumococcal population.
For comparison with the pneumococcal reference set described above, 121 isolates of nonserotypeable presumptive pneumococci were obtained from the Finnish otitis media studies conducted to investigate pneumococcal disease and carriage (5, 10, 11). Details of these isolates are summarized in Table 1. Presumptive pneumococci were identified by colony morphology and sensitivity to optochin (6 µg; Biodisk PDM Diagnostic Disks, Sweden). All isolates discussed here were optochin sensitive on first testing, but a few isolates were optochin resistant in later testing, as indicated in Table 1. All isolates were, on first inspection, not serotypeable. In some cases, however, subsequent testing revealed reactions with omniserum, and in these cases, serotype was determined by the Quellung reaction. Isolates were obtained from either middle ear fluid (MEF) of children with acute otitis media (AOM) (11) or nasopharyngeal (NP) swabs of healthy children or children with AOM (20). Two isolates were excluded during analysis (IOPR 4609 and IOPR 3386), as they failed to yield good-quality sequence at any MLST locus despite repeated attempts.
TABLE 1. Allelic profiles and STs of strains used in this work
Phylogenetics and population genetics. To illustrate differences between individual alleles at the MLST and ply loci, minimum evolution trees were constructed using all nucleotide differences and the Kimura two-parameter (K2P) distance correction in MEGA 2.1. The sequences of all loci except ddl (see below) were concatenated, maintaining the +1 reading frame, and trees were constructed from the concatenated 2,751-bp sequence using MrBayes 3.0b4 (16). A starting neighbor-joining tree was determined in PAUP*4.0beta v.10 (http://paup.csit.fsu.edu/) (19), with distances corrected using the HKY85 model. This tree was supplied to MrBayes as the starting tree; four Markov chain Monte Carlo chains were run with default heating parameters until convergence, and 10,000 trees were sampled from the posterior probability distribution. These were then used to produce a consensus tree. The evolutionary model for MrBayes was chosen using MrModeltest 2.2 (http://www.ebc.uu.se/systzoo/staff/nylander.html) and corresponded to the general time-reversible model with gamma-distributed rates of substitution among sites and a proportion of invariant sites. Nucleotide diversities were estimated using DnaSP (17). Other population genetic analyses were performed using Arlequin v2.0 (http://lgb.unige.ch/arlequin/).
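For readers unfamiliar with the Kimura two-parameter (K2P) correction, the following is a minimal sketch of the standard formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions, respectively. It is not the MEGA implementation; the function name `k2p_distance` and the choice to skip sites with gaps or ambiguity codes are assumptions made for illustration.

```python
import math

# A transition is a change within a class (A<->G or C<->T);
# a transversion is a change between classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption of this sketch: ignore gaps and ambiguity codes.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# One transition in ten sites: the corrected distance slightly exceeds 0.1.
d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

The corrected distance is always at least as large as the raw proportion of differing sites, because the model allows for multiple substitutions at the same site.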
FIG. 1. Phylogenetic tree from concatenated sequences of MLST loci excepting ddl. The tree shows the relationships between the pneumococcal reference set (indicated by open circles) and nontypeable isolates (closed circles). Nontypeable isolates that had the same ST as members of the reference set are shown in gray. Group 1 contains the reference set of pneumococci and all nontypeable strains clustering with them. Group 1b is defined in the text. Group 2 contains the remaining nontypeable isolates, indicated with the prefix NT in Table 1. Trees were generated using MrBayes as described in Materials and Methods, using all nucleotide sites. The scale bar indicates substitutions per site. The asterisk marks the node that is considered to separate authentic pneumococci from group 2 isolates.
Relationships of alleles in nontypeable isolates to those found in the MLST database. The nontypeable isolates that clustered within group 1 were considered to be authentic pneumococci that failed to express a capsule, and any novel alleles in these isolates were assigned allele numbers and added to the pneumococcal MLST database. For those nontypeable isolates that did not fall into group 1, each unique sequence was given an alphabetic allele identifier and retained in a separate database to prevent confusion with the pneumococcal alleles in the MLST database. Minimum evolution trees were constructed for each locus from the sequences of all known pneumococcal alleles in the MLST database together with the alleles from all nontypeable isolates (Fig. 2). In the case of the aroE locus, the sequences cluster into two groups that are highly divergent from one another but with relatively little diversity within each group. The mean diversity of aroE alleles in the MLST database is 2.4 nucleotide differences, in comparison to a mean of 4.1 nucleotide differences among the aroE alleles identified in group 2 strains (significantly greater [Student's t test; P = 0.006]). The same pattern holds for all other loci except ddl: the alphabetic alleles from group 2 strains are significantly more diverse than the alleles from typical pneumococci (Student's t test; P << 0.05 for each locus), as is apparent from the trees shown in Fig. 2, and diverge by >5% from the typical pneumococcal alleles. However, certain alleles present in serotypeable pneumococci in the MLST database clearly cluster with the alphabetic alleles (e.g., recP allele 26) and almost certainly represent instances of lateral transfer between the groups (presumably importation of alleles into pneumococci). The bidirectional nature of this transfer is evident in Table 1: alleles that are present, indeed common, in authentic pneumococci are also found among the nontypeable isolates that cluster outside group 1.
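The mean diversity values quoted above are averages of nucleotide differences between alleles at a locus. A minimal sketch of such a calculation follows, assuming the statistic is the mean number of differences over all pairs of aligned alleles (the text does not spell out the exact estimator). The sequences are toy strings, not alleles from the MLST database, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_differences(seq_a: str, seq_b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pairwise_differences(alleles: list[str]) -> float:
    """Mean number of nucleotide differences over all pairs of alleles."""
    pairs = list(combinations(alleles, 2))
    if not pairs:
        raise ValueError("at least two alleles are required")
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


# Toy alleles (illustrative only): the three pairs differ at 1, 2, and 1 sites.
toy_alleles = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
toy_mean = mean_pairwise_differences(toy_alleles)
```

Comparing this quantity between two sets of alleles, for example those in the MLST database and the alphabetic alleles, gives the kind of contrast reported for aroE; the significance test itself is not reproduced here.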
FIG. 2. Minimum evolution trees for the seven MLST loci. Trees were constructed using the sequences of all alleles from the pneumococcal MLST database and those from the nontypeable isolates. The latter are indicated by red markers. (a) aroE; (b) gdh; (c) gki; (d) recP; (e) spi; (f) xpt; (g) ddl. Trees were generated using MEGA 2.1. All nucleotide differences were used in the analysis, and distances were corrected using the K2P model. The scale bar indicates substitutions per site.
Pneumolysin gene sequence. The ply gene, once considered to be a defining property of the pneumococcus, has recently been demonstrated to be present in isolates of related species (14, 23). We therefore sequenced a 282-bp fragment of the ply locus from all nontypeable isolates and from the pneumococcal reference set. The ply gene was found to be present in all but one isolate, which was highly divergent at the MLST loci (IOPR 2148 [Table 1]). Another isolate, IOPR 2640 (NT34), appeared to contain more than one ply sequence; sequencing of the amplified fragment from several DNA preparations from purified single colonies gave a mixed sequence, although this was not observed with the MLST loci. Interestingly, some of these ply sequences were typical pneumococcal alleles, while others were divergent, suggesting that this strain has a history of interspecific recombination. Each unique ply sequence was assigned a distinct allele identifier following the same conventions described above for MLST genes (i.e., integers for alleles of isolates in group 1 and alphabetic identifiers for the remainder). A minimum evolution tree showing the relationships between the ply sequences is presented in Fig. 3, and the ply alleles assigned to individual isolates are shown in Table 1. Alleles from the reference pneumococcal set again form a distinct cluster, and all of the nontypeable isolates that fall into group 1 in Fig. 1 had ply sequences that either clustered with or were identical to the ply alleles from the pneumococcal reference set. The other nontypeable isolates had ply alleles that were distinct from those of the reference pneumococci and clustered apart from them on the tree.
FIG. 3. Minimum evolution tree for the ply alleles. All alleles found in the pneumococcal reference set or other group 1 isolates (shown in boldface) were found to descend from a single node. The remainder were alleles from nontypeable isolates that clustered apart from group 1 in Fig. 1. The tree was generated using MEGA 2.1. All nucleotide differences were used in the analysis, and distances were corrected using the K2P model. The scale bar indicates substitutions per site.
Single genes are clearly unsatisfactory for exploring these issues, and we have therefore used a multilocus approach. We have previously applied this approach to the human Neisseria species and found it to be capable of discriminating named species even in the presence of recombination (9). Here, we consider the related but distinct question of whether we can define the boundaries of a named species and distinguish it from other related species which may not currently be designated as such. Nontypeable presumptive pneumococci provide a useful source of isolates that are closely related to pneumococci and which are known to include authentic pneumococci that for various reasons may not express a capsule, in addition to isolates that are genetically distinct. Trees based on the concatenated sequences of the MLST loci (excluding ddl) clearly resolved the nontypeable presumptive pneumococci into two groups. Approximately 58% of the isolates clustered with serotypeable pneumococci, whereas the others were a separate and more diverse population. Based on these results, we propose that S. pneumoniae may be defined as all isolates falling into group 1 on a tree based on the concatenated sequences of the six MLST loci. The pneumococcal MLST website now contains a function that allows users to test whether isolates sequenced in their laboratory should be considered true pneumococci under this definition (1).
The presence of the pneumolysin gene in isolates closely related to pneumococci has been previously demonstrated (14, 23) and precludes using the presence of this gene as a means of distinguishing pneumococci from their closest relatives. However, the ply alleles in pneumococci were different from those in isolates that grouped outside the pneumococci in the tree constructed using concatenated MLST sequences. Precisely the same groups were obtained using the sequence of the pneumolysin gene fragment or the concatenated MLST loci. The ply sequences therefore provide a further means of identifying true pneumococci in difficult cases, although rigorous assignment of a nontypeable isolate as a pneumococcus should involve examination of the clustering obtained with both the concatenated MLST loci and the ply gene fragment.
Those isolates not identified as pneumococci by this approach form a much more diverse grouping than the pneumococci, as demonstrated by comparing the mean genetic distance within the two groups. This is even more striking given that group 1 contains strains from a reference set specifically chosen to illustrate the diversity of the pneumococcal population, whereas the atypical isolates reported here were retrieved in longitudinal studies of carriage and AOM within a limited geographic area and are therefore unlikely to represent the full diversity of this population. The relationship of these organisms to the recently proposed species S. pseudopneumoniae (2) remains to be determined. It should be noted that two of the strains falling outside group 1 (IOKOR 731 and IOKOR 1362) were isolated from MEF in children suffering from AOM, suggesting that organisms of this group may harbor pathogenic potential in some disease contexts.
The approach described here clearly delineates the boundaries of the pneumococcal cluster in sequence space and, thanks to its multilocus nature, is resistant to limited shuffling of genetic information across this boundary. Unlike previous attempts to define bacterial species using conceptually similar approaches (22), we tested the ability of our method to discriminate between the pneumococcus as a population and a large group of very closely related isolates that are indistinguishable by other methods. We are also impressed by the ability of this approach to resolve the pneumococcus with confidence, even in the presence of relatively high levels of recombination. However, it is likely that the pneumococcus is an example of a "fuzzy species," and further sampling of the fringes of the pneumococcal cluster may require us to update our definitions. This issue can only be resolved by further work. It should be noted that recombination means that these trees contain no useful information about the relationships within group 1 and group 2, but this is insufficient to obscure the differences between them. It remains to be seen if it is possible to define combinations of phenotypic characteristics shared by all members of group 1 that are not found in any members of the diverse group of related organisms this work has shown clustering apart from genuine pneumococci. The recombinogenic nature of these organisms may mean that attempts to do so are misguided.
This work raises several questions about the nature of the mechanism that generates and maintains these divisions. One possibility is effective reproductive isolation, in which strains mainly undergo recombination with isolates of the same named species. While interspecific recombination does occur, and renders a single-locus approach untenable, it is not common enough to prevent the emergence of those clusters in sequence space we refer to as species. In the case of clonal bacteria, the virtual absence of recombination will necessarily lead to clusters of related strains. What we do not know is what generates such effective isolation. Nor do we know the degree to which recombination must be limited in order to achieve effective isolation and consequent speciation. To resolve these issues, further studies that combine theoretical and molecular approaches are required.