Journal of Bacteriology, December 2009, p. 7157-7164, Vol. 191, No. 23
0021-9193/09/$08.00+0 doi:10.1128/JB.00838-09
Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Department of Bioengineering and Bioinformatics, Moscow State University, Vorob'evy gory 1-73, Moscow 119992, Russia¹; Institute for Information Transmission Problems, RAS, Bolshoi Karetny Pereulok 19, Moscow 127994, Russia²; Stowers Institute for Medical Research, 1000 E. 50th St., Kansas City, Missouri 64110³; Department of Microbiology, Molecular Genetics, and Immunology, University of Kansas Medical Center, Kansas City, Kansas 66160⁴
Received 25 June 2009/ Accepted 18 September 2009
-helical domains to give EutB and EutC, respectively. This fusion was followed by recruitment and occasional loss of auxiliary ethanolamine utilization genes in Firmicutes and by several horizontal transfers, most notably from the firmicute stem to the Enterobacteriaceae and from Alphaproteobacteria to Actinobacteria. We identified a conserved DNA motif that likely represents the EutR-binding site and is shared by the ethanolamine and cobalamin operons in several enterobacterial species, suggesting a mechanism for coupling the biosyntheses of apoenzyme and cofactor in these species. Finally, we found that the food poisoning phenotype is associated with the structural components of the metabolosome more strongly than with ethanolamine utilization genes or with paralogous propanediol utilization genes per se.
In S. Typhimurium, all these genes are part of the eut operon (15), along with the transcriptional regulator (eutR) and the genes that encode the structural components of the metabolosome, a bacterial microcompartment thought to play a role in preventing the escape of gaseous aldehyde but not strictly required for ethanolamine cleavage (7). In addition to these proteins, the eut operons of the Enterobacteriaceae and Firmicutes typically encode several other proteins, including EutP, EutQ, and EutJ. The orthologs of EutBC are sporadically distributed in different lineages of bacteria, in particular in Proteobacteria and Firmicutes, and are not found in archaea or eukaryotes (24). The genes carried by the eut operon and their molecular functions are summarized in Table S1 in the supplemental material.
The evolutionary origin of the ethanolamine utilization system is unclear. The structural proteins of the metabolosome complex are paralogous to the shell proteins of carboxysome (15), an organelle that concentrates CO2 for fixation by ribulose-bisphosphate carboxylase in cyanobacteria and sulfur-oxidizing bacteria (21), but the phylogeny of these shell components remains to be investigated in detail. The amino acid sequences of the main enzyme of the ethanolamine utilization pathway, EutBC, retrieve only closely related, orthologous proteins in database searches, and there is no plausible evolutionary scenario explaining the current phylogenetic distribution of ethanolamine lyase.
The regulation of ethanolamine lyase is most extensively studied for S. Typhimurium, where cobalamin and ethanolamine are both required for the full expression of the eut operon, which is transcriptionally activated by the positive regulator EutR, encoded by the operon itself (15). There is no information on the molecular determinants of the activation of the eut operon by EutR, and the understanding of the coordination of apoenzyme and coenzyme biosynthesis is incomplete. A better-studied paralogous pdu operon of propanediol utilization in S. Typhimurium is controlled by the positive regulator PocR, which, like EutR, belongs to the AraC sequence family. PocR also positively controls the cobalamin synthesis pathway by direct transcriptional activation of the cob operon (6, 25).
The ethanolamine utilization pathway is of practical concern, as it is present in many human and animal pathogens linked to food poisoning. A probabilistic search of the database of phyletic vectors by using the food poisoning phenotype identifies five genes of ethanolamine utilization as near-perfect genotype-to-phenotype matches (18). A more complex machine-learning approach, which examines genome context and cooccurrence of scientific terms in the literature, has connected food poisoning with both ethanolamine and 1,2-propanediol utilization pathways (16). Here, again, however, the molecular basis of the observed biological phenomenon is not known.
In this work, we employed the complete genome sequences of several hundred bacterial species and used computational approaches to answer questions about the regulation of eut genes, the evolution of eut operons, the structure of the crucial EutBC proteins, and the connection between diol utilization pathways and food poisoning.
Phylogenies were inferred either by the maximum likelihood method with the Jones-Taylor-Thornton model, as implemented in the Proml program of the PHYLIP package (11), or by Bayesian estimation in MRBAYES 3.0 (26) with the fixed-rate Poisson model and unconstrained topology. Ancestral states were reconstructed using the parsimony model implemented in the Mesquite suite (19). Regulatory regions were aligned using CLUSTAL_X (35) and MACAW (30), profiles were built from the most-conserved blocks by using the GenomeExplorer program (20), and genomes were scanned with the threshold set at the lowest score observed in the training set (13). Sequence logos were produced by the WebLogo program (10), and phylogenetic trees were drawn using the iTOL server (17). Enrichment statistics were derived from the standard hypergeometric distribution as implemented in R (14); P values were calculated from that distribution function by using the phyper function.
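The hypergeometric enrichment test mentioned above can be illustrated with a minimal sketch. It is not the authors' script: the function name and the toy counts are assumptions made for illustration, and the calculation shown is the upper-tail probability P(X >= k), which in R would correspond to `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric.

    N: total number of genomes considered
    K: genomes that carry the gene of interest
    n: genomes that show the phenotype
    k: genomes that carry the gene AND show the phenotype
    (The biological reading of N, K, n, k is an assumption of this sketch.)
    """
    upper = min(K, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i)  # comb() returns 0 when n - i > N - K
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(N, n)

# Hypothetical toy counts, not data from the study.
p_value = hypergeom_upper_tail(k=2, N=10, K=4, n=3)
print(f"P(X >= 2) = {p_value:.4f}")
```

A small P value under this model would indicate that the gene and the phenotype co-occur more often than expected if the gene were distributed independently of the phenotype.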
In Actinobacteria and in most Proteobacteria, the eut operon consists only of eutBC and usually the transporter eat. Some Proteobacteria additionally contain an ortholog of the transcription regulator eutR at a different genomic location but no apparent orthologs of other eut genes (Fig. 1). On the other end of the spectrum, there is "the long operon" found in Enterobacteriaceae, Firmicutes, Nocardioides sp., and Fusobacterium nucleatum. This is an arrangement of up to 17 genes, which may also include some of the putative propanediol utilization genes and duplicates of the metabolosome genes. In Enterobacteriaceae, some genes of such a complete set may be missing: for example, Shigella sonnei Ss046 has no eutS, -P, -Q, and -T genes; Shigella boydii Sb227 lacks eutG, -H, and -A; and Shigella dysenteriae Sd197 has only eutS, -C, -L, -K, and -R and a truncated eutB.
FIG. 1. Diversity of eutBC genome contexts. Short operons: A, Deltaproteobacteria; B, a subset of Proteobacteria, Chloroflexi, and Bacteroidetes; C, selected Proteobacteria and Acidobacteria; and D, Betaproteobacteria (eutR is in a different genomic location than eutBC and eat). Long operons: E, Nocardioides sp.; F, Enterobacteriaceae; G, M. aquaeolei; H, S. boydii Sb227; I, S. sonnei Ss046; J, S. dysenteriae Sd197; K, Symbiobacterium thermophilum and P. luminescens; L, P. fluorescens Pf-5; M, Clostridiaceae and F. nucleatum; N, Listeriaceae and Enterococcaceae; and O, C. acetobutylicum. A probable eutB pseudogene is shown in white. The predicted EutR-binding sites are indicated by purple ellipses. See Tables S1 and S2 in the supplemental material for complete lists of gene names and species.
FIG. 2. Maximum likelihood evolutionary tree of EutB. The bootstrap support of tree partitions is indicated by branch color: green, >70%; blue, 50 to 70%; and red, <50%. The outer color stripe marks genes from the short operon in red and genes from the long operon in green. The inner color circle marks bacterial clades: red, Alphaproteobacteria; orange, Betaproteobacteria; green, Gammaproteobacteria; dark blue, Deltaproteobacteria; dark purple, Firmicutes; pink, Actinobacteria; lime, Acidobacteria; light purple, Fusobacteria; light blue, Chloroflexi; and light green, Bacteroidetes. The shaded background marks two clades that do not agree with established bacterial phylogeny and suggest horizontal gene transfer events (see text).
EutB and EutC are paralogous to diol dehydratases/lyases. The sequences of EutB and EutC are well conserved in evolution, and they are not clearly similar to those of other proteins in typical database searches. Even iterative scans of the protein databases with the PSI-BLAST program (2) detect only the orthologs of each protein. The large EutB subunit of ethanolamine-ammonia lyase has been predicted to adopt the eight-stranded TIM barrel-like fold, similar to what was found for other cobalamin-dependent dehydratases with known structures, such as propanediol dehydratase and glycerol dehydratase (Protein Data Bank [PDB] accession no. 1eex, 1dio, and 1mmf) (32). The interaction with the cofactor must occur predominantly in the "bottom" portion of the barrel, corresponding to the C termini of the predicted beta strands forming the inner barrel surface (33). Recently, the three-dimensional structure of a hexamer of the Listeria monocytogenes EutB protein was determined (PDB accession no. 2qez), confirming these earlier predictions. In addition to the alpha/beta TIM barrel domain, however, a smaller alpha-helical N-terminal domain was found, consisting of residues 1 to 140, wrapped around the external surface of the TIM barrel and making contacts with the adjoining monomers. The homologous large subunits of other cobalamin-dependent enzymes also have additional N-terminal sequence regions, which are missing from the available crystal structures.
Using sensitive comparison of probabilistic models of protein families with the HHsearch program (31), we found that the N-terminal alpha-helical domains of cobalamin-dependent lyase large subunits are homologous: for example, the first 140 aligned positions of the dehydratase large subunits, as specified by the hidden Markov model built automatically by HHsearch from the sequence of propanediol dehydratase, match the N-terminal region of the EutB family model with a probability (P) value of 1.7 × 10^-4. Interestingly, the residues most conserved between different lyases are not the same as the ones involved in the interactions of EutB within the hexamer (see Fig. S1 in the supplemental material), suggesting either that these interactions in the crystallized form are not representative of the EutBC complex in vivo or that the N-terminal domains of the lyase large subunits play roles in addition to homooligomerization.
The three-dimensional organization of the small ethanolamine-ammonia lyase subunit (EutC) remains unknown. Prediction of the secondary and tertiary structures of EutC suggests an alpha/beta structure in this protein and a borderline structural similarity to NADP-dependent methylenetetrahydrofolate dehydrogenase from M. tuberculosis (PDB accession no. 2c2x), covering more than 60% of residues in both molecules (see Fig. S1 in the supplemental material). This is compatible with the Rossmann-like alpha/beta fold in EutC, which is also the fold adopted by the beta subunit of propanediol dehydratase (37). Interestingly, in an analogy with the large subunit, EutC is also predicted to have a small N-terminal alpha-helical domain, and this domain, or at least its longest helix, is clearly conserved in the propanediol dehydratase beta subunit. Moreover, specific sequence similarity to this region can also be detected in the N termini of three other proteins involved in the same pathways but having completely different structures, namely, in the all-helical gamma subunit of propanediol dehydratase (which actually gives better alignment to EutC than the apparently homologous beta subunit); in the beta-barrel protein EutQ, a member of the cupin superfamily; and in the phosphotransacylase PduL. Some of these similarities span relatively short numbers of residues, e.g., only 36 in the case of EutC-EutQ alignment, but nonetheless are specifically recovered with HHsearch (E value below 10^-4) (see Fig. S1 in the supplemental material). Only the 17-residue EutC-PduL match is reported without statistical support.
EutB and EutC phylogeny. To elucidate the evolutionary history of the core eut pathway, we aligned sequences of EutB and EutC and inferred the phylogenies of these proteins. The results were largely in agreement for both subunits and for all algorithms of phylogenetic inference. In particular, the subdivisions of Proteobacteria (Alphaproteobacteria, Betaproteobacteria, Deltaproteobacteria, and Gammaproteobacteria) form distinct clades that are clustered together in all trees. One exception is the Enterobacteriaceae, which appear as a long branch distant from other Gammaproteobacteria. The other is the actinobacterial branches, which are nested within Proteobacteria and, more specifically, within Alphaproteobacteria, whereas most of the Firmicutes form a sister clade with the Enterobacteriaceae (Fig. 2). Sequences from Acidobacteria and Chloroflexi are typically found within Proteobacteria, and the Fusobacteria sequences are clustered with Firmicutes. Inclusion of an outgroup (assuming that the subunits of 1,2-propanediol lyase are paralogous to EutBC) (see Discussion) suggests the root position on the long branch leading to Proteobacteria, though a relatively low level of statistical support on the deep branches makes this assignment tentative.
Two major subtrees, one including Firmicutes and the Enterobacteriaceae and the other including the rest of the Proteobacteria with the nested Actinomycetales, correspond to the two main types of eutBC contexts that were discussed in the previous section, i.e., the long and short versions, respectively (Fig. 2). Even in the species with two eut operons (K. pneumoniae, M. aquaeolei, and P. fluorescens), the two copies of EutB (and EutC) cluster in the trees not with each other but with the orthologs from the species that share the long- or short-operon context.
Comparative genomics of eutBC regulation. The eut operons are controlled by at least two types of regulatory systems: in most Enterobacteriaceae and in some Betaproteobacteria (including Polaromonas naphthalenivorans CJ2, Methylibium petroleiphilum PM1, and all sequenced Burkholderiales), the operon is regulated by EutR, while Firmicutes and F. nucleatum have a two-component regulatory system adjacent to the eut genes (for example, EutV and EutW in Enterococcus faecalis [12]). Interestingly, this two-component system also appears to have a high rate of coinheritance with the cobalamin biosynthesis genes, again pointing in the same functional direction (18). Actinobacteria and a subset of the Proteobacteria lack orthologs of these genes, so these bacterial groups may possess yet other systems of eut operon regulation.
We analyzed sequence conservation in the upstream regions of the eut operons in two groups of EutR-containing genomes that include several diverse species, i.e., Enterobacteriaceae with 6 species and Betaproteobacteria with 17 species. Multiple sequence alignment of the putative regulatory regions in the first group revealed two conserved sequences (see Fig. S2 in the supplemental material). In S. Typhimurium, the global transcription factor Crp is known to control the paralogous pdu operon and may be involved in eut regulation as well (1), and the first region that we discovered, wwwTGTGAtyyrgwTCACTtWt, which is similar to the canonical Crp-binding motif (4), may indeed play a role in recognizing Crp. The other conserved region in the Enterobacteriaceae did not match any known regulatory sites. However, it was closely similar to the separately defined nucleotide motif in the regions located upstream of the eut operons in Betaproteobacteria, which was the only conserved element in the latter group of species (Fig. 3). We constructed positional weight matrices of this motif and scanned the intergenic regions of the various bacterial species with this model. In Betaproteobacteria, there were no significant similarities other than self-matches in the eut regulatory region, and we did not find this or any other conserved DNA motifs upstream of the eut operons in Firmicutes or any other bacterial groups. Interestingly, in Enterobacteriaceae, the next-best match after the self-matches was the intergenic region preceding cbiA, the 5'-proximal gene in the cob operon required for de novo cobalamin biosynthesis (see Table S3 in the supplemental material).
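The scanning procedure described here (a positional weight matrix built from aligned sites, with the cutoff set at the lowest score observed in the training set) can be sketched as follows. This is an illustrative reconstruction, not the GenomeExplorer implementation: the log-odds scoring with a pseudocount and uniform background, the function names, and the toy sequences are all assumptions.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds positional weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds weights for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (start, site, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous symbols
        window_score = score(pwm, window)
        if window_score >= threshold:
            hits.append((start, window, window_score))
    return hits

# Hypothetical toy training set and intergenic region, not sequences from the study.
training_sites = ["TGTGA", "TGTGA", "TGCGA", "TGTGC"]
pwm = build_pwm(training_sites)
threshold = min(score(pwm, site) for site in training_sites)
hits = scan(pwm, "AATGTGACCTGCGATT", threshold)
for start, site, site_score in hits:
    print(start, site, round(site_score, 3))
```

Setting the threshold at the weakest training site guarantees that every training site is recovered as a self-match, so any additional hit scores at least as well as a site already accepted as genuine.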
FIG. 3. Conserved elements in proteobacteria that may bind EutR. (A) Conserved element preceding the eut operon in Betaproteobacteria (for site scores and locations, see Table S3 in the supplemental material); (B) conserved binding element preceding the eut operon in Enterobacteriaceae; (C) conserved element upstream of the cbiA gene in Enterobacteriaceae.
TABLE 1. Strength of links between eut and pdu genes and food poisoning
The EutB and EutC proteins of the Enterobacteriaceae cluster with the orthologs from Firmicutes and not with those from other Proteobacteria. Both Enterobacteriaceae and Firmicutes are the tips of long branches in our trees, and it is possible that their adjoining positions are due to long-branch attraction artifacts (5). We feel, however, that two other factors may contribute to this tree topology: first, the evolution of the EutBC enzymes in the long operons in Enterobacteriaceae and Firmicutes may be constrained in similar ways by the interaction with the metabolosome, and second, there may have been another act of horizontal gene transfer in the early evolution of these operons.
In order to understand the ancient evolutionary events better, we attempted to reconstruct the ancestral states of the eut operon using a simple parsimony model of gene gain and loss implemented in the Mesquite software package (19). Under this model, the ancestral proteobacterium is inferred to have contained eutBC and eat genes. Other components of the pathway appear in the branch leading to the Enterobacteriaceae, as does the transcriptional regulator eutR (also present in some Betaproteobacteria, to which it might have been transferred from the Enterobacteriaceae). The ancestral operon in Firmicutes likely included eutABC and the two-component regulatory system, but the ancestral state of other eut genes in this lineage cannot be determined unambiguously given the current data.
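As an illustration of how parsimony reconstruction of gene presence and absence works, the sketch below implements the bottom-up pass of the classic Fitch algorithm on a rooted binary tree. It is a stand-in chosen for clarity: the exact parsimony model applied in Mesquite may differ, and the tree, taxon names, and gene states are hypothetical.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping leaf name -> state (1 = gene present, 0 = absent).
    Returns (set of equally parsimonious states at this node, minimum changes below it).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, leaf_states)
    right_set, right_changes = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change is needed here.
        return shared, left_changes + right_changes
    # Children disagree: keep both options and count one gain or loss event.
    return left_set | right_set, left_changes + right_changes + 1

# Hypothetical five-taxon tree and presence/absence pattern for one gene.
toy_tree = (("taxon_A", "taxon_B"), ("taxon_C", ("taxon_D", "taxon_E")))
toy_states = {"taxon_A": 1, "taxon_B": 1, "taxon_C": 0, "taxon_D": 0, "taxon_E": 1}

root_states, n_changes = fitch(toy_tree, toy_states)
print("root state set:", sorted(root_states), "minimum changes:", n_changes)
```

When the root set contains both states, as in this toy case, presence and absence at the ancestor are equally parsimonious, which is the kind of ambiguity noted above for some eut genes in the ancestral firmicute.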
A conservative estimate of about five genes in an ancestral firmicute, together with an even smaller set of genes in the ancestral proteobacterium, may suggest the following tentative evolutionary scenario. The earliest version of the eut operon may have emerged by cooptation of a TIM barrel and a Rossmann fold, by adornment of them with additional N-terminal helical domains, and by recruitment of a permease gene for transportation of ethanolamine from the environment (perhaps of the eat type, which seems to be spread more widely in the extant species and more closely associated with the short operons than eutH). The small set of genes was supplemented by a recycling factor and a two-component regulatory system in Firmicutes, which may have replaced the Eat transporter with EutH (though the eat gene is retained in Clostridium acetobutylicum). More-recent evolution of the operon included acquisition of genes encoding the structural components of the metabolosome and accrual of other eut genes. Gains and losses of auxiliary ethanolamine degradation genes resulted in the diversity of gene contexts of eutBC.
Enterobacteriaceae may have acquired a partially formed eut operon with metabolosome shell genes from Firmicutes. Such direction of horizontal transfer is more plausible than the opposite one, given that the Enterobacteriaceae is a younger evolutionary lineage than Firmicutes and also that some deep-branching Proteobacteria, such as Photorhabdus luminescens, do not include eut genes.
The scenario outlined above assumes three horizontal gene/operon transfer events, i.e., a transfer of a long eut operon from an ancestral firmicute to the Enterobacteriaceae, a transfer of a short operon from an alphaproteobacterium to the Actinobacteria, and acquisition of the metabolosome shell genes by an ancestral firmicute, probably from a cyanobacterium that had a carboxysome. Evolutionary histories with fewer horizontal transfers can also be proposed, yet those typically include massive operon losses or unlikely evolutionary events, such as parallel accrual of similar sets of orthologous genes in long operons. On balance, we feel that our hypothesis of a gradual buildup of the eut operon within Firmicutes and its transfer to the Enterobacteriaceae, with another transfer from the Alphaproteobacteria to the Actinobacteria, is most compatible with the available biochemical and genomic evidence.
The EutR regulator in S. Typhimurium responds to two effectors, cobalamin and ethanolamine. In the absence of one or both effectors, there is a weak basal constitutive expression of eutR from the PII promoter. Elevated concentrations of the effectors induce eut operon activation by EutR through the PI promoter (see Fig. S3 in the supplemental material). EutR is hypothesized to sense cobalamin and ethanolamine directly (29), but the molecular basis for this recognition is not known. We found a conserved sequence in the upstream regions of the eutR-containing operons in Enterobacteriaceae and Betaproteobacteria, which is also present upstream of the cobalamin biosynthesis operon in Enterobacteriaceae. It is plausible that the Enterobacteriaceae use this control element to coordinate production of the EutBC apoenzyme with simultaneous synthesis or import of its cobalamin cofactor. Such coregulation may be achieved if EutR indeed serves as a sensor of both compounds and if its activated form upregulates the expression of both the eut and cbi operons. On the other hand, cobalamin is required for the activity of several enzymes in addition to EutBC, and therefore, negative regulation of the eut operon by depletion of EutR has to be decoupled from cbi operon regulation. This may be achieved via positive regulation of vitamin B12 production by multiple inputs (i.e., at least PocR in addition to EutR) and negative regulation by the B12-responsive riboswitch (22, 36).
Analysis of links between distribution of individual genes and the food poisoning phenotype suggests that the pathogenic phenotype may be related to the presence of some reaction intermediate, such as perhaps highly active cobalamin-derived radical species produced in the course of catalysis (3), or even some spurious compound, when it is driven to local high concentrations in the metabolosome. On the other hand, the core component of the ethanolamine utilization reaction, the EutBC enzyme, as well as ethanolamine transporters, appears to be relatively benign.
This study was supported by the Stowers Institute and by grants from the Howard Hughes Medical Institute to M. S. Gelfand (55005610), the Russian Academy of Science (Molecular and Cellular Biology program), and the Russian Foundation for Basic Research (08-04-01000-a).
O.T. and A.M. conceived the study, and all authors analyzed the data, wrote the manuscript, and approved its final form.
Published ahead of print on 25 September 2009.
Supplemental material for this article may be found at http://jb.asm.org/.
The authors have paid a fee to allow immediate free access to this article.