Allegheny General Hospital, Allegheny-Singer Research Institute, Center for Genomic Sciences, Pittsburgh, Pennsylvania 15212, USA; Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania, 15213, USA; National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA; The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK; Department of Microbiology and Immunology, Drexel University College of Medicine, Allegheny Campus, Pittsburgh, PA 15212
* To whom correspondence should be addressed. Email: fhu{at}wpahs.org.
Abstract
The distributed genome hypothesis (DGH) states that pathogenic bacteria possess a supragenome that is much larger than the genome of any single bacterium, and that these pathogens utilize genetic recombination and a large, non-core set of genes as a means of diversity generation. We sequenced the genomes of eight nasopharyngeal strains of Streptococcus pneumoniae isolated from pediatric patients with upper respiratory symptoms, and performed quantitative genomic analyses among them and nine publicly available pneumococcal strains. Coding sequences from all strains were grouped into 3170 orthologous gene clusters, of which 1454 (46%) were conserved among all 17 strains. The majority of the gene clusters, 1716 (54%), were not found in all strains. Genic differences per strain pair ranged from 35 to 629 orthologous clusters, with non-core genes making up between 21% and 32% of each strain's genome. The distribution of the orthologous clusters per genome for the 17 strains was entered into the Finite Supragenome Model, which predicted that: 1) the S. pneumoniae supragenome contains over 5000 orthologous clusters; and 2) 99% of the orthologous clusters (~3000) that are represented in the S. pneumoniae population at frequencies ≥0.1% can be identified if 33 representative genomes are sequenced. These extensive genic diversity data support the DGH and provide a basis for understanding the great differences in clinical phenotype associated with various pneumococcal strains. Taken together with previous studies that demonstrated the presence of a supragenome for Streptococcus agalactiae and Haemophilus influenzae, it appears that the possession of a distributed genome is a common host-interaction strategy.
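
The core and non-core counts reported above can be illustrated with a minimal sketch. It is not the authors' pipeline and it does not implement the Finite Supragenome Model. It assumes each strain is represented as a set of orthologous cluster identifiers, and it assumes that the genic difference between two strains is the number of clusters present in exactly one of them. The function names, strain names, and toy data are hypothetical.

```python
from itertools import combinations


def core_and_distributed(presence):
    """Split clusters into core (in every strain) and distributed (the rest).

    presence: dict mapping strain name -> set of cluster ids (hypothetical layout).
    """
    all_clusters = set().union(*presence.values())
    core = set.intersection(*presence.values())
    return core, all_clusters - core


def pairwise_genic_differences(presence):
    """Count clusters found in exactly one strain of each pair.

    Assumption: 'genic difference' is taken as the symmetric difference.
    """
    return {
        (a, b): len(presence[a] ^ presence[b])
        for a, b in combinations(sorted(presence), 2)
    }


def noncore_fraction(presence, core):
    """Fraction of each strain's clusters that fall outside the core set."""
    return {strain: len(genes - core) / len(genes) for strain, genes in presence.items()}


# Toy data for illustration only; these are not the sequenced strains.
toy = {
    "strainA": {"c1", "c2", "c3", "c4"},
    "strainB": {"c1", "c2", "c5"},
    "strainC": {"c1", "c2", "c3", "c6"},
}

core, distributed = core_and_distributed(toy)
differences = pairwise_genic_differences(toy)
fractions = noncore_fraction(toy, core)

# Consistency check on the figures reported in the abstract.
total_clusters = 3170
core_clusters = 1454
noncore_clusters = total_clusters - core_clusters
core_percent = round(100 * core_clusters / total_clusters)
noncore_percent = round(100 * noncore_clusters / total_clusters)

print(sorted(core), sorted(distributed))
print(differences)
print(noncore_clusters, core_percent, noncore_percent)
```

On the toy data, two clusters are shared by all three strains and four are distributed. The final lines confirm that the reported counts are internally consistent: 3170 minus 1454 gives 1716, and the two groups round to 46% and 54% of all clusters.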