Journal of Bacteriology, June 2007, p. 4520-4528, Vol. 189, No. 12
0021-9193/07/$08.00+0 doi:10.1128/JB.00277-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Apartado 127, 2781-901 Oeiras, Portugal,1 European Synchrotron Radiation Facility, Bp-220, F-38043 Grenoble Cedex, France,2 Institute for Biotechnology and Bioengineering, Center for Biological and Chemical Engineering, Instituto Superior Técnico, 1049-001 Lisbon, Portugal3
Received 19 February 2007/ Accepted 2 March 2007
Gellan gum is available commercially in three chemical forms, with no, low, and high acyl content, under the respective names Gelrite, Kelcogel F, and Kelcogel LT100. Gelrite is used as a substitute for agar in microbiological and tissue culture media. The Kelcogels are food-grade gellan gums used mainly as gelling agents in foods and personal care products. Other biomedical applications of gellan include its use as a pharmaceutical excipient for nasal and ocular drug delivery (32) and as a material for the construction of three-dimensional (3-D) scaffolds for tissue engineering (10).
The commercial applications of gellan have been a stimulus to study its biosynthesis. The production pathway is a multistep process starting with the intracellular formation of the nucleotide-sugar precursors UDP-Glc, UDP-GlcA, and dTDP-L-Rha. This is followed by the formation of the repeat unit, with sequential transfer of the sugar donors to an activated lipid carrier by committed glycosyltransferases, and, ultimately, by gellan polymerization and export (28). The genes coding for the proteins involved in the synthesis of dTDP-L-Rha, the glycosyltransferases, and the proteins required for gellan polymerization and export are located together in the gel cluster (28, 34). However, the genes (pgmG, ugpG, and ugdG) coding for the enzymes required for the synthesis of the nucleotide sugars (UDP-Glc and UDP-GlcA) from Glc-1-phosphate (G1P) are located at different positions on the S. elodea genome (25, 28, 34), perhaps reflecting the broader functions of these enzymes beyond the specific gellan pathway.
G1P uridylyltransferase (UGP) (EC 2.7.7.9) mediates the reversible conversion of G1P and UTP into UDP-Glc and pyrophosphate. This enzyme is widely distributed throughout all domains of life, although the eukaryotic UGPs are unrelated in terms of amino acid sequence to their prokaryotic counterparts (13, 14, 16). UGPs belong to the superfamily of nucleoside diphosphate sugar pyrophosphorylases. The protein most closely homologous in sequence to bacterial UGP is G1P thymidylyltransferase (RmlA) (EC 2.7.7.24), a bacterial enzyme that mediates the reversible production of dTDP-Glc from G1P and dTTP, the first step of a four-step pathway necessary for the formation of the nucleotide sugar dTDP-L-Rha (30). The protein under study, UGP from S. elodea (UgpG), recognizes both UTP and dTTP as substrates in vitro (30), while other UGPs have been reported to be unable to use dTTP as a substrate in vitro (5). Studies have been performed to understand why some of these proteins are promiscuous toward both their nucleoside triphosphate (NTP) and sugar substrates (3, 4), but the reasons are yet to be established. We have previously described the kinetic properties (25, 30) and preliminary structure determination of UgpG from the industrial microorganism S. elodea ATCC 31461 (2). Recently, the coordinates of a crystallographic structure of Escherichia coli UGP were deposited in the PDB under accession number 2E3D, but no associated publication has been reported. The study presented here reports the structure analysis of UgpG and highlights the main differences between UGPs and RmlAs.
Data collection, structure determination, and refinement. X-ray diffraction data were collected at the European Synchrotron Radiation Facility (ESRF), Grenoble, France, beamlines ID14-2 and ID29. The diffraction images were integrated with MOSFLM (24) and the intensities scaled, merged together, and reduced to structure factor magnitudes using the CCP4 suite of programs (8), which was also used for further data handling and scaling purposes. Reflections for the calculation of Rfree were grouped in thin resolution shells, due to the presence of eightfold noncrystallographic symmetry (NCS). Diffraction data statistics are presented in Table 1.
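To illustrate what grouping the free reflections in thin resolution shells can look like, the following minimal Python sketch flags whole shells for the free set. It is not the procedure implemented in the CCP4 programs; the function name, the number of shells, and the spacing of free shells are hypothetical choices made for illustration only. The usual motivation is that NCS-related reflections lie at the same resolution and are correlated, so selecting whole shells keeps them together in either the working set or the free set.

```python
def assign_free_flags_in_shells(d_spacings, n_shells=50, free_every=20):
    """Flag reflections for the free set by whole resolution shells.

    d_spacings: resolution (in Å) of each reflection.
    n_shells, free_every: hypothetical illustration parameters.
    Returns a list of booleans (True = reflection is in the free set).
    """
    if not d_spacings:
        return []
    # Equal steps in 1/d^3 give shells of roughly equal reciprocal-space volume,
    # hence roughly equal numbers of reflections per shell.
    s3 = [1.0 / d ** 3 for d in d_spacings]
    lo, hi = min(s3), max(s3)
    width = (hi - lo) / n_shells or 1.0  # avoid zero width if all d are equal
    flags = []
    for value in s3:
        shell = min(int((value - lo) / width), n_shells - 1)
        # Every `free_every`-th shell is assigned to the free set as a whole.
        flags.append(shell % free_every == 0)
    return flags
```

With the default values, shells 0, 20, and 40 out of 50 are set aside, which gives about 6% of the reflections to the free set.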
TABLE 1. Refinement statistics for UgpG at 2.65 Å
Comparison of 3-D structures. All of the RmlA and the E. coli UGP coordinate files deposited in the Protein Data Bank were aligned with the UgpG structure presented here by using the secondary structure matching algorithm (22) implemented in COOT (15). This initial alignment was further refined with MODELLER (29), with a threshold of 3.5 Å for interprotein residue matching.
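The role of the 3.5 Å matching threshold can be sketched in a few lines of Python. This is not the MODELLER or COOT implementation; it assumes the two structures are already superposed and that the residue pairing is given, and the function name and input layout are hypothetical.

```python
import math


def rmsd_within_cutoff(coords_a, coords_b, cutoff=3.5):
    """RMSD over paired atoms lying within `cutoff` Å of each other.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples for residue
    pairs that are already aligned and superposed (an assumption here).
    Returns (rmsd, number_of_matched_pairs).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have equal length")
    squared = []
    for a, b in zip(coords_a, coords_b):
        d2 = sum((p - q) ** 2 for p, q in zip(a, b))
        # Pairs farther apart than the cutoff are treated as unmatched.
        if d2 <= cutoff ** 2:
            squared.append(d2)
    if not squared:
        return float("nan"), 0
    return math.sqrt(sum(squared) / len(squared)), len(squared)
```

The second return value corresponds to the number of fitted positions that is quoted alongside each RMSD in the comparisons.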
Since this paper was submitted, the 1.9-Å-resolution description of the glucose-1-phosphate uridylyltransferase from E. coli has been published (32a).
Protein structure accession numbers. The coordinates and structure factors of UgpG have been deposited in the Protein Data Bank (PDB accession numbers, 2UX8 and 2UX8SF).
electron density maps, probably due to disorder. The asymmetric unit consists of two tetramer units, with one G1P molecule per UgpG monomer and a total of 160 modeled water molecules.
An enzyme gel filtration profile and the results of dynamic light scattering experiments show a single species in solution with a mass of about 120 kDa (data not shown). This corresponds to approximately four times the expected molecular mass of the 289-residue protein monomer and suggests that the crystallographic tetramers are the functional units of UgpG.
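The arithmetic behind this estimate can be checked with a short Python sketch. The average residue mass of 110 Da is a common rule of thumb and an assumption here, not a value taken from the paper, so the monomer mass is only approximate; the function name is hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumption: rough average mass per residue


def estimate_oligomeric_state(observed_mass_kda, n_residues,
                              residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Return (approximate monomer mass in kDa, mass ratio, nearest integer state)."""
    monomer_kda = n_residues * residue_mass_da / 1000.0
    ratio = observed_mass_kda / monomer_kda
    return monomer_kda, ratio, round(ratio)


# Values from the text: a solution mass near 120 kDa and a 289-residue monomer.
monomer, ratio, state = estimate_oligomeric_state(120.0, 289)
print(f"monomer ~ {monomer:.1f} kDa, ratio ~ {ratio:.2f}, nearest state = {state}")
```

Under this assumption the monomer comes out near 32 kDa and the ratio is a little under 4, consistent with a tetramer.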
The refined structure converged to a final Rwork and Rfree of 24.5% and 29.5%, respectively. A stereochemical analysis of the refined model using PROCHECK (23) revealed 86%, 13%, 1%, and 1% residues in the most favorable, additional allowed, generously allowed, and disallowed regions of the Ramachandran plot, respectively. The residues in the disallowed areas are partially supported by the electron density maps and lie in regions with high B factors. The overall geometry was better than for the set of PROCHECK comparative structures at a resolution of 2.65 Å, with a G factor of 0.13.
Overall polypeptide fold. The monomer (Fig. 1a) has overall dimensions of around 60 Å by 45 Å by 35 Å and is classified by CATH (27) as an alpha/beta complex fold (CATH classification of 3.90). Each monomer is made up of 10 helices and 15 β-strands (20), labeled consecutively from 1 to 10 and A to O, respectively (Fig. 1b). The UgpG monomer is built around a large, mixed β-sheet, reminiscent of the Rossmann fold but including nine β-strands, spanning the whole molecule and surrounded by 10 helices and two additional antiparallel β-sheets. The molecule may be divided into three functional regions or subdomains (12). The nucleotide binding subdomain (NBSD) (residues 1 to 132) is composed of the first four β-strands of the main β-sheet sandwiched between their interconnecting helices. It resembles other nucleotide binding domains (6) and is strongly involved in intermonomer interactions. Subdomain 2, the sugar binding subdomain (SBSD) (residues 133 to 256), binds G1P and is the most compact of the three subdomains. It is composed of the second half of the main β-sheet, including five mixed strands, the two extra β-sheets with two and four strands, and four α-helices. Finally, subdomain 3, the dimerization subdomain (DSD) (residues 257 to 289), is composed of two α-helices arranged like a "V" and, together with subdomain 1, is responsible for the formation of the active tetramer (see below).
FIG. 1. UgpG monomer. (a) Ribbon representation of UgpG showing G1P in ball-and-stick representation (black). The molecule may be divided into three functional regions, the NBSD, the SBSD, and the DSD, which are colored in light gray, medium gray, and dark gray, respectively. (b) Topological diagram of UgpG using the same color scheme as described for panel a for subdomain identification. Arrows represent β-strands and cylinders represent helices. (c) Superposition of UgpG (black ribbon) and RmlA structures (gray) showing their overall similarity. The loop consisting of residues 219 to 229 lies above the catalytic cavity. The labeled helix bundle, only present in the RmlA structures, contains a second allosteric NTP binding site.
's within 0.2 to 0.4 Å for 237 positions to a cutoff of 3.5 Å. This assembly is responsible for an overall buried surface area upon tetramer formation of 12 kÅ², slightly larger than the 10-kÅ² surface of one monomer alone (calculated using AREAIMOL) (8). The interdimer assembly that produces the tetramer (Fig. 2a and b) is dominated by the stacking of helices 2 and β-strand C between monomers A and B (in a total of 36 to 46 van der Waals interactions [vdW]; 13 to 20 hydrogen bonds). The interdimer interfaces are strongly bound due to two particular interactions of the NBSD and DSD. From the NBSD, residues 15 to 39 form a set of hydrogen bonds (defined as such up to 3.2 Å) and favorable apolar interactions (vdW; defined as such up to 3.6 Å) with their symmetry equivalents, and in particular, methionines 27 and 31 and 27' and 31' create a four-methionine apolar region (Fig. 2c). These interactions are responsible for 28 to 52 vdW interactions and 6 to 10 hydrogen bonds between monomers, depending on the particular dimer. The DSD is responsible for the helix-helix interactions between helices 9 and 10 and 9' and 10', with one set with the "V" shape fitted into the inverted "V" of the neighbor monomer (Fig. 2b, middle top and middle bottom, and d). This interaction is responsible for 8 to 17 vdW interactions and one or two hydrogen bonds within each dimer. The two interactions account, on average, for 90% of the total vdW interactions and 70% of the total hydrogen bonds that create the dimer. Upon tetramer formation, the parallel alignment of the C β-strands allows the extension of the monomers' wide β-sheet into two interdimer supra-β-sheets (Fig. 2e, stacks of near-parallel strands represented by lines), contributing to the tetramer stability. These interactions account for 82% of the vdW interactions and 98% of the hydrogen bonds of the total dimer-dimer contact interface.
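The distance criteria quoted above (hydrogen bonds up to 3.2 Å, apolar vdW contacts up to 3.6 Å) can be turned into a simple contact count, sketched below in Python. The element-based classification is an assumption made for illustration: the text gives only the distance cutoffs, and a real analysis would also consider geometry and donor/acceptor chemistry. The function name and atom layout are hypothetical.

```python
import math

HBOND_CUTOFF = 3.2    # Å, hydrogen-bond distance limit quoted in the text
VDW_CUTOFF = 3.6      # Å, apolar-contact distance limit quoted in the text
POLAR = {"N", "O"}    # assumption: only N and O act as donors or acceptors
APOLAR = {"C", "S"}   # assumption: C and S (e.g., methionine) count as apolar


def count_interface_contacts(atoms_a, atoms_b):
    """Count candidate hydrogen bonds and apolar contacts between two monomers.

    Each atom is a tuple (element, x, y, z). Returns (n_hbonds, n_vdw).
    """
    hbonds = vdw = 0
    for el_a, *xyz_a in atoms_a:
        for el_b, *xyz_b in atoms_b:
            dist = math.dist(xyz_a, xyz_b)
            if el_a in POLAR and el_b in POLAR:
                if dist <= HBOND_CUTOFF:
                    hbonds += 1
            elif el_a in APOLAR and el_b in APOLAR:
                if dist <= VDW_CUTOFF:
                    vdw += 1
    return hbonds, vdw
```

Running such a count separately for each pair of monomers would give ranges of the kind reported here, since the numbers depend on the particular dimer.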
FIG. 2. UGP tetramer. (a, b, and d) Perpendicular views of the UgpG oligomer, colored according to the monomers, in ribbon representation, with G1P in ball-and-stick representation. Dimers AA' and BB' are represented by the green and magenta and the yellow and blue monomers, respectively. While panel b shows the whole tetramer, panels a and d represent only the front monomers, for purposes of clarity. (c) A-A' interactions (residues 15 to 37) with residues 27 and 27' and 31 and 31', creating a four-methionine apolar region. (e and f) Schemes of UGP and RmlA tetramer arrangements, respectively. The rectangles represent monomers, color coded as described for panel b. The lines inside the monomer rectangles represent β-strands, which in UGP form an extended β-sheet across monomers A and B. The schemes highlight the different quaternary structures adopted by UGP and RmlA.
FIG. 3. 3-D-structure alignment of all available RmlA and UGP structures upon superposition with MODELLER (29). Residues boxed in white form H bonds with G1P, residues highlighted in gray form H bonds with nucleotides, and residues highlighted in black constitute a hydrophobic cap to the base of the sugar ring at the catalytic cavity. RmlA PDB accession numbers 1IIN, 1IIM, 1MP3, 1MP4, and 1MP5 are from Salmonella enterica; 1G0R, 1G23, 1G3L, 1G1L, and 1FXO are from Pseudomonas aeruginosa; 1H5R and 1MC3 are from E. coli; and 1LVW is from Methanobacterium thermoautotrophicum. UGP PDB accession numbers are 2UX8 (S. elodea) and 2E3D (E. coli).
FIG. 4. UgpG catalytic cavity. (a) UgpG transparent solvent-accessible surface around the catalytic cavity, with bound G1P (ball-and-stick) and catalytic H-bonding residues (sticks). On the left of G1P, there is enough space for an NTP molecule. The side-chain atoms of residues M109 and V204 and the main-chain atoms of residues Q105, D133, D134, and E191 were removed for clarity. (b) G1P H-bonding network with UgpG catalytic residues, where dashed lines represent proton-donor to proton-acceptor distances of up to 3.2 Å. (c) The RmlA-deoxythymidine complex structure 1H5R (36) with its H-bonding network. (d) A deoxythymidine molecule was manually fitted, using a molecular graphics workstation, into the empty UgpG catalytic cavity using the complex structure shown in panel c as a template. M109 is represented by its main-chain atoms only, for purposes of clarity. Carbon is represented in black, oxygen in white, nitrogen in light gray, and phosphorus in dark gray.
Structural neighbors. A structural similarity search using DALI (17) against the UgpG monomer shows a number of structurally related proteins, many of which belong to the superfamily of bacterial nucleotide diphosphate sugar pyrophosphorylases. The 12 structures with z scores between 22 (best) and 14 (worst) and RMSDs from 2.2 to 3.3 Å, respectively, are shown in Table S1 of the supplemental material. The model with the highest z score and, as it happens, the highest sequence identity (21% for 214 3-D homologous Cα positions) to UgpG is PDB accession number 1G0R, a Pseudomonas aeruginosa RmlA structure complexed with thymidine and G1P. Many of the other high-scoring structures share the NTP binding subdomain with UgpG.
Modeling of nucleotide binding. The structural superpositions described above show that the proteins' NBSDs are very similar. This fact, together with the existence of several complexes with nucleotides, was used to help model thymidine in the UgpG catalytic cavity in the proximity of the bound G1P. Using the E. coli RmlA structure complexed with both deoxythymidine and G1P (PDB accession number 1H5R [36]) as a reference (Fig. 4c), a homology model was built (Fig. 4d). The model shows that in UgpG, too, thymidine is likely to form hydrogen bonds with residues A14, G15, N105, M109, G110, and D133. UgpG can employ either thymidine or uridine as substrates. Their bases differ in structure only by a methyl group, and our structure has space to accommodate it, leading to little specificity between the two bases. One can expect the binding of either nucleotide to be very similar.
The 3-D structural comparison of the monomers of 13 published X-ray crystal structures of RmlAs with UgpG shows similar monomer folds, with RMSDs between 1.5 and 1.7 Å for more than 200 fitted Cα positions with 3-D structural identities below 20% (see Table S2 in the supplemental material). Indeed, the sequence analysis made by Silva et al. (30) of 17 RmlA proteins and three putative or confirmed UGPs showed the two enzymes to form two divergent groups in a phylogenetic tree. The UgpG structure strongly suggests that dTDP- and UDP-Glc pyrophosphorylase enzymes share a common 3-D structure, as Silva et al. suggested, albeit with low sequence conservation. Local inspection of the structure superpositions shows that the active-site cavities and side-chain spatial orientations of the catalytic residues are conserved (Fig. 3 and 4). A loop consisting of residues 219 to 229 and located between helices 7 and 8 closes the active site in all of the RmlA structures, whether substrates are bound or not, while in the UgpG structure, this loop adopts an "open" conformation, allowing access to the active site where G1P is already bound in place (Fig. 1c). It should be noted, though, that this loop in the UgpG structure is visible in the electron density maps and was modeled unambiguously for only one monomer. The lack of clear electron densities for the remaining monomers is a sign of the mobility of this loop, which acts as a gatekeeper to the active site and which, by holding both substrates, may then help to stabilize the ternary complex.
When compared to UgpG, all RmlAs have a C-terminal extension of 30 residues forming a three-helix bundle that is absent in UgpG (Fig. 1c, bottom right). The volume corresponding to the missing helix bundle is partially occupied by helix 3 in UgpG. This region is particularly significant as it accommodates a second, presumed to be allosteric, NTP binding site on some RmlAs (31). There is, however, no biochemical, sequence, or structural evidence to date that UgpG has a similar second site.
The results of the DALI search to locate structural neighbors of the UgpG monomer show that similar proteins cover diverse quaternary structures, from monomers to multimeric associations forming dimers and tetramers. However, none of the tetrameric structures shows a quaternary arrangement similar to that found in UgpG, which reveals a new type of oligomerisation compared to any of its DALI neighbors and, indeed, to any of the RmlA structures published to date. Oligomers 1G0R (RmlA) (4) and 1QWJ (murine CMP-5-N-acetylneuraminic acid synthetase) (21) have RmlA-like tetrameric arrangements (see below), and the tetrameric arrangement of 1YP2 (ADP-Glc pyrophosphorylase) (19) is different from that of UgpG due to an extra subdomain in 1YP2 that is the principal component for forming its monomer-monomer interactions.
While the UgpG tetramer has the four symmetrically related helices 2, one from each monomer, arranged in the center of the tetramer, all RmlA structures show their oligomerization equivalent to be helix 9. The interface that creates the dimer in UgpG remains the same as in RmlA, but the interface that builds the dimer of dimers is from the opposite side of the monomer (Fig. 2e and f). Consequently, the RmlA tetramers do not contain the long continuous stack of β-chains between adjacent monomers (see "Quaternary structure and oligomerisation interfaces" in Results).
A comparison of UgpG with the recently deposited E. coli UGP (PDB accession number 2E3D) shows a 1.4 Å RMSD between homologous Cα's to a cutoff of 3.5 Å, including 238 positions with a 35% identity in a 3-D-based sequence alignment. It is noteworthy that none of the four crystallographically independent molecules in 2E3D could be completely modeled. Most significantly, their tertiary and quaternary structures coincide with those of UgpG.
The analysis of the UgpG catalytic cavity structure and its comparison with that of known RmlA structures reveal considerable similarities in sugar and nucleotide binding, as might be expected for similar catalytic functions. Moreover, in mechanistic terms, conserved residues can be identified in 3-D positions equivalent to those proposed to act as the catalytic residues in RmlAs, which points to a similar catalytic mechanism. Although space is available at the catalytic cavity of UgpG to accommodate either UTP or dTTP, the presence of a flexible loop (residues 219 to 229) that could close the active site and the results of calorimetry suggest that protein conformation changes may play an important role in the different specific activities observed in vitro (25, 30), which may determine the UgpG preference for UTP.
D.A. acknowledges a grant from FCT, SFRH (BD/6480/2001). This work was partially supported by FEDER and Fundação para a Ciência e a Tecnologia (FCT), Portugal (POCTI/BME/44441/2002 and POCTI/BIO/58041/2004).
Published ahead of print on 13 April 2007.
Supplemental material for this article may be found at http://jb.asm.org/.