Journal of Bacteriology, December 2005, p. 8370-8374, Vol. 187, No. 24
0021-9193/05/$08.00+0 doi:10.1128/JB.187.24.8370-8374.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Michael Krasnitz,¹* Hagar Barak,² and Arnold J. Levine¹
Institute for Advanced Study, Natural Sciences, Einstein Drive, Princeton, New Jersey 08540,¹ and Molecular Biology Department, Princeton University, Washington Road, Princeton, New Jersey 08544²
Received 15 August 2005/ Accepted 3 October 2005
The degeneracy of codons allows a multitude of possible sequences to code for the same protein. Hidden within the particular choice of sequence for each organism are over 100 previously undiscovered, biologically significant short oligonucleotides (2 to 7 nucleotides long). We present an information-theoretic algorithm that finds these novel signals. Applying this algorithm to the 209 sequenced bacterial genomes in the NCBI database, we determine for each bacterium a set of oligonucleotides that uniquely characterizes the organism. Some of these signals have known biological functions, such as restriction enzyme binding sites, but most are new. We also introduce an accompanying scoring algorithm that assigns 100-kb sequences to their correct species, out of hundreds of candidates, with 92% accuracy. This algorithm also performs far better than previous methods at relating phage genomes to their bacterial hosts, suggesting that the lists of oligonucleotides are "genomic fingerprints" that encode information about the effects of the cellular environment on DNA sequence. Our approach provides a novel basis for phylogeny and is potentially well suited for classifying the short DNA fragments obtained by environmental shotgun sequencing. The methods developed here can be readily extended to other problems in bioinformatics.
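The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of assigning a DNA fragment to a genome by its short-oligonucleotide content. It is not the authors' information-theoretic method or their scoring algorithm: it swaps in a plain k-mer frequency profile with a log-likelihood score. The function names (`kmer_counts`, `kmer_profile`, `log_likelihood`, `classify`), the pseudocount, and the choice of a single fixed `k` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_profile(seq, k, pseudocount=1.0):
    """Return smoothed k-mer frequencies over all 4**k possible k-mers.

    The pseudocount (an assumption of this sketch) keeps every frequency
    positive so that the log-likelihood below is always defined.
    """
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) + pseudocount * 4 ** k
    return {
        "".join(p): (counts["".join(p)] + pseudocount) / total
        for p in product("ACGT", repeat=k)
    }


def log_likelihood(fragment, profile, k):
    """Score a fragment as the sum of log frequencies of its k-mers."""
    return sum(n * log(profile[kmer])
               for kmer, n in kmer_counts(fragment, k).items())


def classify(fragment, profiles, k):
    """Return the name of the profile under which the fragment scores highest."""
    return max(profiles,
               key=lambda name: log_likelihood(fragment, profiles[name], k))
```

To use it, one would build a profile per genome, for example `profiles = {name: kmer_profile(genome, 4) for name, genome in genomes.items()}` for a hypothetical `genomes` dictionary of sequences, and then call `classify(fragment, profiles, 4)`. A single fixed `k` is the simplest choice; the paper itself works with oligonucleotides of length 2 to 7 and selects specific signals per organism, which this sketch does not attempt.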
H.B. and M.K. contributed equally to this work.