J. Bacteriol., Mar. 1995, p. 1585-1588, Vol. 177, No. 6
Copyright © 1995, American Society for Microbiology

Widespread protein sequence similarities: origins of Escherichia coli genes

B Labedan and M Riley
Institut de Génétique et Microbiologie, Université de Paris-Sud, Orsay, France.

To learn more about the evolutionary origins of Escherichia coli genes, we systematically searched for extended sequence similarities among the 1,264 amino acid sequences encoded by chromosomal genes of E. coli K-12 in SwissProt release 26, using the FASTA program and imposing the following criteria: (i) alignment of segments at least 100 amino acids long and (ii) at least 20% amino acid identity. Altogether, 624 extended alignments meeting the two criteria were identified, corresponding to 577 protein sequences (45.6% of the 1,264 E. coli protein sequences) that had an extended alignment with at least one other E. coli protein sequence. To exclude alignments of questionable biological significance, we imposed a high threshold on the number of gaps allowed in each of the 624 extended alignments, giving us a subset of 464 proteins. The population of 464 alignments has the following characteristics, expressed as median values of the group: 254 amino acids in the alignment, representing 86% of the length of the protein; 33% of the amino acids in the alignment being identical; and 1.1 gaps introduced per 100 amino acids of alignment. Where functions are known, nearly all pairs consist of functionally related proteins. This implies that the sequence similarity we detected has biological meaning and did not arise by chance. That a major fraction of E. coli proteins form extended alignments strongly suggests the predominance of duplication and divergence of ancestral genes in the evolution of E. coli genes. The range of degrees of similarity shows that some genes originated more recently than others. (ABSTRACT TRUNCATED AT 250 WORDS)
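The two selection criteria and the subsequent gap filter can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Alignment` record layout, the function names, the toy protein identifiers, and in particular the value of `MAX_GAPS_PER_100` are assumptions made for illustration, since the abstract does not state the gap threshold that was used. Only the 100-residue and 20% identity cutoffs come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One pairwise alignment (hypothetical record layout, not FASTA output)."""
    query: str        # identifier of the first protein
    subject: str      # identifier of the second protein
    length: int       # length of the aligned segment, in amino acids
    identities: int   # identical positions within the aligned segment
    gaps: int         # gaps introduced in the alignment

    @property
    def percent_identity(self) -> float:
        return 100.0 * self.identities / self.length

    @property
    def gaps_per_100(self) -> float:
        return 100.0 * self.gaps / self.length


MIN_LENGTH = 100         # criterion (i) in the abstract
MIN_IDENTITY = 20.0      # criterion (ii) in the abstract, in percent
MAX_GAPS_PER_100 = 3.0   # ASSUMED placeholder; the abstract gives no value


def is_extended(a: Alignment) -> bool:
    """True if the alignment meets both criteria (i) and (ii)."""
    return a.length >= MIN_LENGTH and a.percent_identity >= MIN_IDENTITY


def passes_gap_filter(a: Alignment, limit: float = MAX_GAPS_PER_100) -> bool:
    """True if the gap density does not exceed the (assumed) limit."""
    return a.gaps_per_100 <= limit


def proteins_in(alignments: list[Alignment]) -> set[str]:
    """Distinct proteins that take part in at least one alignment."""
    return {name for a in alignments for name in (a.query, a.subject)}


# Toy data with made-up identifiers, only to exercise the filters.
toy = [
    Alignment("protA", "protB", length=250, identities=80, gaps=2),
    Alignment("protC", "protD", length=90, identities=45, gaps=0),    # too short
    Alignment("protE", "protF", length=150, identities=24, gaps=1),   # 16% identity
    Alignment("protG", "protH", length=200, identities=50, gaps=12),  # gap-rich
]

extended = [a for a in toy if is_extended(a)]
kept = [a for a in extended if passes_gap_filter(a)]
```

With the toy data, two alignments meet criteria (i) and (ii), and one of them survives the assumed gap filter. The percentage reported in the abstract follows directly from the counts given there: 577 of 1,264 sequences is 45.6%.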

