Joseph Krahn,1 Byung Sik Shin,2 Diana R. Tomchick,1 Howard Zalkin,2 and Janet L. Smith1*
Departments of Biological Sciences1 and Biochemistry,2 Purdue University, West Lafayette, Indiana 47907
Received 31 January 2003/Accepted 29 April 2003
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
-D-5-phosphoribosyl-1-pyrophosphate (PRPP) and for synthesis of AMP from IMP (10, 26, 38). The two-step pathway from IMP to GMP is separately regulated. The PurR-PurBox system also regulates transcription of the purR operon (purR and yabJ) (38) and of several genes involved in cofactor biosynthesis (glyA and folD), purine salvage (guaC and xpt), and purine transport (pbuG, pbuO, and pbuX) (28). B. subtilis PurR is a 285-residue, homodimeric protein (38). The PurR sequence includes a 13-residue PRPP-binding motif (Fig. 1) that is characteristic of the PRT protein family (32, 38). Most PRT family members are phosphoribosyltransferases (PRTases) with a PRPP substrate, but no catalytic activity has been detected for PurR (H. Zalkin, unpublished data). Precedent for adaptation of the PRT fold to a regulatory function occurs in PyrR, a transcription attenuator for pyrimidine biosynthetic genes in B. subtilis (35, 36). PurR is about 70 amino acids longer at the N terminus than most PRT proteins. This region is a proposed DNA interaction domain (38), although at the sequence level it is not obviously related to any other DNA-binding protein.
PRPP appears to be the inducer of genes regulated by PurR because it is the only molecule among many nucleobases, nucleosides, and nucleotides known to affect PurR-DNA binding in vitro (38). This is consistent with the PRPP-binding sequence motif in PurR. In vivo, adenine affects PurR-mediated repression of transcription initiation (9). Excess adenine signals are thought to be transmitted to PurR via the cellular PRPP pool (38).
Homologs having 40 to 65% sequence identity with B. subtilis PurR occur in 20 other gram-positive bacteria (Fig. 1). However, functional studies have been reported for only the Lactococcus lactis homolog, which also regulates transcription of purine genes. L. lactis PurR activates transcription of purC and purDEK (15). PurBoxes located 76 nucleotides upstream of purC and purD are required for transcription activation, with the conserved G in the central GAAC sequence being essential. A second, tandem PurBox is located 93 nucleotides upstream of the purC start site but is not required for transcription regulation. PurR appears to autorepress transcription of purR in L. lactis (16).
B. subtilis PurR is unrelated to the more familiar purine repressor from Escherichia coli, also called PurR. E. coli PurR is a member of the LacI family and binds to a well-defined 16-base-pair DNA palindrome via an N-terminal helix-turn-helix (HTH) domain. Hypoxanthine and guanine are corepressors that increase the DNA affinity of E. coli PurR. B. subtilis and E. coli PurR proteins belong to different protein families, bind different DNA sequences, and regulate transcription by different mechanisms. With its PRT domain, B. subtilis PurR represents a new protein family among bacterial repressors that is thus far limited to gram-positive eubacteria.
Here we report the 2.2-Å crystal structure of B. subtilis PurR. Surfaces involved in DNA and PRPP binding are apparent in the dimer of winged-helix and PRT domains. The structure provides a foundation for understanding the mechanism of transcription regulation by PurR (4).
| MATERIALS AND METHODS |
|---|
, 300 mM NaCl] and well solution (5% polyethylene glycol [PEG] 8000, 66.7 mM HEPES [pH 7.0], 167 mM Li2SO4). SeMet PurR was crystallized under similar conditions from a 1:1 mixture of protein solution [10 mg of PurR per ml, 10 mM HEPES (pH 8.0), 50 mM
, 300 mM NaCl, 5 mM dithiothreitol] and well solution (5% PEG 8000, 83.3 mM HEPES [pH 7.0], 450 mM Li2SO4), which was equilibrated for 3 days and then streak seeded with crystals of wild-type PurR. Platelike crystals grew to an average size of 0.2 by 0.4 by 0.05 mm in approximately 1 month.
Data collection.
The crystals were harvested in 200 µl of well solution and cryoprotected by five successive exchanges of 50 µl of harvesting solution with an equal volume of cryosolution (20% PEG 400, 10% PEG 8000, 200 mM NaCl, 500 mM Li2SO4, 51 mM HEPES [pH 7.0]) every 30 s. The crystals were immediately flash frozen in a gaseous N2 stream at 100 K. Frozen crystals had a high mosaicity (~1.3°), which was reduced to ~0.55° by cryoannealing. The crystal was thawed by immersion in the final cryoprotectant solution for 2 min and flash frozen again in a gaseous N2 stream at 100 K. All data used to solve the structure were collected from a single SeMet PurR crystal. Multiwavelength anomalous diffraction (MAD) data were recorded at beamline BM-14 at the European Synchrotron Radiation Facility (ESRF). X-ray fluorescence spectra from a SeMet crystal were recorded from 12.65 to 12.68 keV to select wavelengths for data collection. Due to beam time limitations, data sets for only two wavelengths at the Se K edge (249° at the inflection point, λ1, 12.6645 keV, 0.9790 Å; and 237° at the peak, λ2, 12.6671 keV, 0.9788 Å) were recorded at the ESRF on a Mar 345 imaging plate detector with 60-s exposures over 1° of crystal rotation per image. Several days later, data were collected from the same crystal (326.4° at the remote, λ3, 8.0416 keV, 1.5418 Å) using CuKα radiation and an R-axis IIc imaging plate with 30-min exposures over 0.6° of crystal rotation per image. The useful data were limited to 294°, as after 8 days of data collection, diffraction quality deteriorated dramatically due to crystal decay and severe ice formation. Data were processed with the HKL package (24). Unit cell parameters in space group P1 are a, 65.1 Å; b, 72.2 Å; c, 83.0 Å; α, 84.8°; β, 84.0°; and γ, 67.5° with four copies (Vm = 2.8 Å3/Da, ~57% solvent) of the PurR polypeptide per asymmetric unit. The MAD data set used for phasing was produced by scaling the λ2 and λ3 data sets to the merged data for λ1 by using SCALEIT from the CCP4 suite (7). Data quality is summarized in Table 1.
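The wavelengths and crystal packing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below converts photon energy to wavelength, computes the volume of the triclinic cell, and derives a Matthews coefficient and solvent fraction. The subunit mass of 32,000 Da is an assumption (the text gives only the 285-residue length, so this is a rough estimate), and 1.23 Å3/Da is the conventional constant used in Matthews-type solvent estimates rather than a value taken from this paper.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * Angstrom


def energy_to_wavelength(energy_kev):
    """Convert X-ray photon energy (keV) to wavelength (Angstrom)."""
    return HC_KEV_ANGSTROM / energy_kev


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic Angstrom) of a general triclinic unit cell."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


def matthews_coefficient(cell_volume, copies_per_cell, subunit_mass_da):
    """Crystal volume per unit of protein mass (cubic Angstrom per Da)."""
    return cell_volume / (copies_per_cell * subunit_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


# ASSUMPTION: approximate mass of one 285-residue PurR chain; not stated in the text.
ASSUMED_SUBUNIT_MASS_DA = 32000.0

# Cell parameters as reported. In P1 the asymmetric unit is the whole cell,
# so the four copies per asymmetric unit are also four copies per cell.
volume = triclinic_cell_volume(65.1, 72.2, 83.0, 84.8, 84.0, 67.5)
vm = matthews_coefficient(volume, 4, ASSUMED_SUBUNIT_MASS_DA)
solvent = solvent_fraction(vm)

# Photon energies (keV) reported for the three data sets.
wavelengths = {
    "lambda1": energy_to_wavelength(12.6645),
    "lambda2": energy_to_wavelength(12.6671),
    "lambda3": energy_to_wavelength(8.0416),
}

print(f"cell volume = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {solvent:.0%}")
for name, value in wavelengths.items():
    print(f"{name}: {value:.4f} A")
```

Under the assumed subunit mass, the result agrees with the reported Vm of 2.8 Å3/Da and roughly 57% solvent to within rounding; a different mass estimate would shift both values slightly.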
λ1 data set.
Model building and refinement.
The model was built into a 2.7-Å electron density map by using the program O (14). The initial model was 83.5% complete and had an Rwork value of 0.483 and an Rfree value of 0.477 for all data between 20.0 and 2.4 Å. For two iterations of refinement, the model was refined against the 2.4-Å λ1 data set with maximum-likelihood amplitude targets by using the Crystallography and NMR system (5). Subsequently, the model was refined with maximum-likelihood amplitude and phase probability targets in the Crystallography and NMR system against a wavelength-combined 2.2-Å data set (λ1+λ2+λ3, Table 1). This data set, generated by merging data at λ1 (20 to 2.2 Å), λ2 (20 to 2.2 Å), and λ3 (30 to 3.1 Å), was more complete at low resolution and more redundant overall. Numerous iterations of manual rebuilding and simulated annealing, using models with different omitted regions, were required to complete the model. The model of the protein core was restrained by fourfold noncrystallographic symmetry during all cycles of refinement except the last three. Atomic occupancies of the sulfate ions, HEPES molecules, and protein residues in dual positions were each refined once. Final Rwork and Rfree values for all data between 30.0 Å and 2.2 Å were 0.188 and 0.237, respectively. The final refined model consists of two PurR dimers, AB and CD, related by an approximate twofold screw axis, with 270 residues in monomers A and D and 269 residues in monomers B and C, eight sulfate ions, two HEPES molecules, six fragments of PEG, and 568 water molecules. Model quality is summarized in Table 2. This model has been deposited in the Protein Data Bank and is available with accession code 1p41.
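Rwork and Rfree are conventionally the same residual, R = Σ||Fobs| − k|Fcalc|| / Σ|Fobs|, evaluated over two disjoint sets of reflections: those used in refinement and a small set held out from it. The sketch below illustrates that standard definition only. The amplitudes are made-up toy values, the scale factor k is fixed at 1 for simplicity, and this is not the maximum-likelihood target used for the refinement described above.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Crystallographic R-factor: sum(| |Fo| - k|Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated amplitude lists differ in length")
    numerator = sum(abs(abs(fo) - scale * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# HYPOTHETICAL toy amplitudes, for illustration only (not data from this study).
work_obs, work_calc = [100.0, 80.0, 60.0, 40.0], [90.0, 85.0, 66.0, 36.0]
free_obs, free_calc = [50.0, 30.0], [40.0, 33.0]

r_work = r_factor(work_obs, work_calc)  # reflections used in refinement
r_free = r_factor(free_obs, free_calc)  # reflections held out from refinement

print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the free set never enters refinement, a gap between the two values that widens during refinement is the usual warning sign of overfitting.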
| RESULTS AND DISCUSSION |
|---|
Protein structure. The PurR polypeptide folds into two domains connected by a short three-residue linker (Fig. 2B). The domain interface buries 541 Å2 of surface area in each monomer. The N-terminal domain comprises residues 1 to 74, and the C-terminal domain includes residues 77 to 285 (Fig. 3).
The N-terminal domain is a winged-helix domain, a subdivision of the HTH structural family (Fig. 3A). An HTH domain had not been predicted from the PurR sequence. The nearest structural neighbor is the winged-helix domain of the biotin operon repressor BirA (39), with an RMSD value of 1.5 Å over 60 Cα positions and 16% sequence identity (Fig. 3A). The PurR winged-helix domain consists of the canonical arrangement of secondary structures: α1-β1-α2-T-α3-β2-W-β3, where α2-T-α3 is the HTH motif, α3 is the recognition helix, and W is the wing (Fig. 1 and 2B). Most winged-helix proteins have a second wing following β3, but in PurR, a three-residue linker in this position connects to the C-terminal domain.
The C-terminal domain of B. subtilis PurR has a 13-residue sequence motif for PRPP binding, which is characteristic of the PRT family (Fig. 1) (32). As expected, this domain has the PRT fold consisting of a central parallel β sheet flanked by α helices (Fig. 2B). Secondary structures at the N and C termini of the PRT domain form a hood over the C-terminal edge of the central β sheet, as occurs in other PRT family members (Fig. 1 and 2B). Among the PRTs, B. subtilis PurR is most closely related to the adenine PRTases (APRTs). At the sequence level, the 21% identical wheat APRT is the closest homolog. The closest structural homolog among 38 other PRTs of known structure is the Leishmania donovani APRT (25), with an RMSD of 1.7 Å for 151 Cα positions (Fig. 3B).
PurR crystallized as a symmetric dimer (Fig. 2A). Both domains contribute to a substantial dimer interface. The interface between the two N-terminal winged-helix domains consists of a well-packed core of 22 hydrophobic residues and three pairs of peripheral hydrogen bonds, burying 16% (747 Å2) of the total domain surface area. In contrast, the interface between the two C-terminal PRT domains is less tightly packed and consists of a hydrophobic core of 20 residues burying 8% (788 Å2) of the total domain surface area. The dimer found in the crystal clearly corresponds to the solution dimer because of the extensive hydrophobic interface between subunits. There are no significant differences between the AB and CD dimers in the crystallographic asymmetric unit. However, subunits A and D have similar crystal lattice environments, as do subunits B and C, and these subunits therefore have more similar atomic B-factor distributions, disordered residues, and atomic positions (Table 2).
Winged-helix domain and DNA binding. The HTH proteins are the most thoroughly studied group of prokaryotic DNA-binding proteins. They exhibit a remarkable variety of schemes for sequence-specific DNA recognition and binding. While a DNA-PurR complex cannot be modeled from these other proteins, available biochemical, sequence, and structural information limits the possibilities for PurR.
DNA sequence recognition is generally achieved by interaction of the HTH recognition helix with the major groove of DNA (for example, reference 1). Winged-helix proteins generally follow this pattern, and in addition, one or both of the wings contact the DNA minor groove (for example, reference 6). However, this binding mode is not universal. There is considerable variability in the length, structure, and function of the wings in these proteins. For example, DNA sequence recognition by eukaryotic transcription factor RFX1 occurs by insertion of the wing into the major groove, while only one side chain of its recognition helix binds in the minor groove (12). In contrast, the wings of the catabolite activator protein do not interact with DNA (29).
Most HTH proteins are dimeric and bind to DNA inverted repeats with aligned protein and DNA symmetries. However, in contrast to canonical HTH or winged-helix proteins, two or more PurR dimers bind one inverted repeat, which encompasses considerably more DNA than in complexes with other HTH or winged-helix proteins. Precedent for two protein dimers binding one DNA inverted repeat is found in the multidrug-binding repressor QacR (30). Typical HTH proteins bind inverted repeats of about 20 base pairs, with recognition helices bound in adjacent major grooves on one side of the DNA duplex. The DNA inverted repeats recognized by PurR encompass more than 40 base pairs (14 base pairs per repeat and 16 to 17 intervening base pairs), and more than 70 base pairs are required for high-affinity protein binding (4).
Among winged-helix proteins, the extensive hydrophobic contact between domains is unique to PurR. The hydrophobic interface, which is continuous with the hydrophobic cores of the two domains, is formed by α1 and the recognition helix, α3. Close association of winged-helix domains is a common feature of PurR homologs because amino acid side chains in the interface (Leu9, Val10, Thr13, and Leu17 in α1 and Ile46, Ile47, Thr50, and Phe51 in α3) are conserved as hydrophobic in 21 eubacterial sequences. This seems to rule out control of DNA affinity by a classic allosteric switch in which relative positions of the reading heads change in response to PRPP binding in the PRT domain.
The most positively charged surface of HTH and winged-helix domains is consistently the DNA binding surface (11). This is striking in RFX1, where the wing rather than the recognition helix has a positive surface and binds DNA (12). The most positively charged surface of the PurR winged-helix domain (Fig. 4) is formed by the N-terminal tail (amino terminus, Lys2, and Arg4) and beginning of α1 (Arg5 and Arg8). These residues from both subunits form a positively charged swath across the bottom surface of the PurR dimer (Fig. 4). The large positive surface is bounded by the carboxyl groups of Glu42 in the recognition helix, α3, separated by 24 Å across the dimer interface. A sulfate ion from the crystallization solution bound to Lys2, Arg8, Ser35, and the backbone of Phe3 demonstrates that this region is a suitable site for interaction with DNA phosphate. If DNA binds to the winged-helix domains of PurR, it surely binds to this positively charged surface.
The function of PurR homologs is unknown apart from the L. lactis transcription activator, also called PurR. However, a strong case is made for a common DNA-binding function by mapping conserved residues onto the structure (Fig. 5). All eubacterial homologs have the winged-helix domain and are thus presumptive DNA-binding proteins. The positively charged surface is conserved among the 21 eubacterial sequences. The length of the N-terminal tail varies (i.e., from 2 to 6 residues), but always includes one or two positively charged side chains in addition to the amino terminus (Fig. 1). Invariant residues cluster at the positively charged surface (Arg5 and Arg8), the N terminus of the recognition helix α3 (Lys37, Ser38, Ser40, Glu42, and Asp43), and the wing (Gly64, Gly67, and Gly68). These regions of the domain form a continuous surface at the bottom of the PurR dimer, with the positive charges at the center and the wings in the outermost positions (Fig. 4).
PRT domain and PRPP binding.
Most PRT family members are PRTases and catalyze the displacement of the α1-pyrophosphate of PRPP by a nitrogen-containing nucleophile to produce a β1-substituted ribose-5-phosphate and free pyrophosphate (PPi). All PRTs have similar PRPP binding sites, comprising three structures at the C-terminal edge of the central β sheet (Fig. 6A). A PPi loop (PurR residues 138 to 141) follows the first β strand of the central sheet and contains a characteristic nonproline cis peptide between Ala138 and Thr139 (Fig. 1). A long flexible loop (residues 160 to 188) between the second and third β strands is fully ordered in other PRTs only when closed over bound PRPP. A PRPP loop (residues 203 to 211) is within the PRPP-binding motif and follows the third β strand of the central sheet. The structure of the PRPP-binding motif (residues 199 to 211) is extremely well conserved (Fig. 3B), with pairwise RMS deviations of less than 1 Å for thirteen Cα atoms from PurR and various other PRTs. The PPi and PRPP loops are adjacent and form the topological switch point of the PRT fold.
Residues constituting these three structural regions are remarkably well conserved among PurR and its eubacterial homologs (Fig. 1), consistent with a common function. In addition to residues expected to contact PRPP directly, a second tier of conserved residues forms specific contacts to maintain precise positions for the three critical loops. For example, a hydrogen bond between the side chain of invariant Thr136 and the backbone carbonyl of conserved Ile202 anchors the PPi loop to the PRPP loop; invariant Gly141 keeps the PPi loop near invariant Asp203 and Asp204 in the PRPP loop; and invariant Gly210 and Gly214 permit the flexible loop to pack closely against the PRPP loop.
In PurR, as in many other PRT structures lacking PRPP, a sulfate ion from the crystallization solution mimics the binding of PRPP 5-phosphate (Fig. 6A) through hydrogen bonds with the amides of Met206, Lys207, Ala208, Gly209, invariant Gly210, and invariant Thr211 in the PRPP loop, with the side chain of Thr211, and with the amide of conserved Gly178 in the flexible loop. The sulfonate of a HEPES buffer molecule in subunits A and D and a sulfate ion in subunits B and C mimic the β-phosphate of PRPP pyrophosphate. This sulfate/sulfonate forms hydrogen bonds with side chains and backbone amides of invariant cis-Thr139 and Lys140 in the PPi loop and an ionic contact with conserved Lys161 of the flexible loop. Two hydrogen bonds with the sulfate/sulfonate are formed by the side chain of invariant Arg160 from the flexible loop of the partner subunit in the dimer (Fig. 1 and 6A). All of these interactions with sulfate or sulfonate are characteristic of PRPP binding to other PRT proteins.
The PRT flexible loop of PurR is ordered through most of its length, forming an antiparallel β ribbon (β8 and β9 and an eight-residue connecting loop). The β ribbon partially covers the PRPP binding site through the hydrogen bond between Gly178 and the sulfate bound in the PRPP loop. A similar β ribbon conformation also occurs in the flexible loop of L. donovani APRT (25), the nearest structural neighbor of PurR, although the loop is more open in APRT. Unlike other PRTs, the disordered region of the PurR flexible loop is not the tip of the loop but rather the connection between the PRT core and the first β strand of the loop (residues 163 to 167).
The presumed PRPP binding sites are clefts at the top of the PurR dimer separated by about 15 Å. Among the 21 eubacterial PurR homologs, invariant residues cluster around these sites consistent with a common PRPP-binding function for all of the proteins (Fig. 5). The invariant residues from the two subunits form a highly conserved top surface for the PurR dimer.
PRT family members have a hood above the central β sheet. In PurR, the hood (residues 92 to 110 and 252 to 264; Fig. 2B) does not resemble the hood of any other PRT protein, nor does it resemble any other protein motif. The hood of PRTases is responsible for binding the nitrogen-containing nucleophile cosubstrate (ammonia, adenine, guanine, hypoxanthine, xanthine, orotate, and uracil). PurR appears incapable of binding a nucleophile because conserved Tyr102 fills the space normally occupied by the nucleophile substrates in PRTases (Fig. 6B). Contacts of Tyr102 with invariant Phe205 in the PRPP loop fix the position of the tyrosine ring. The PurR hood is also stabilized by several interactions of conserved residues, including a bidentate salt bridge between invariant Arg96 and Asp107, a hydrogen bond between the carbonyl of invariant Gly100 and conserved Lys207 in the PRPP loop, and several hydrophobic interactions. Thus, the PurR hood seems designed to prevent nucleophile binding. The inability to bind a nucleophile explains both the observed absence of phosphoribosyltransferase activity in PurR and the inability of molecules with nucleobases to disrupt PurR-DNA binding (38).
DNA binding to PurR. How does the PurR dimer bind specifically to control site DNA? Using other winged-helix and HTH proteins as a precedent, we presume that the positively charged surface of a PurR winged helix dimer recognizes the conserved CGAA sequence in the center of a DNA PurBox. However, PurR protection of long stretches of control site DNA suggests that high-affinity DNA binding requires nonspecific interactions with additional regions of the winged-helix and/or PRT domains. Positively charged surfaces on the front, bottom (winged-helix domains) and top (presumed PRPP binding sites) of the PurR dimer are potential DNA binding sites (Fig. 4). Use of multiple binding surfaces also would explain both the requirement for more than 70 base pairs of control site DNA and the alternating pattern of DNase I protected and hypersensitive sites. Similar modes of DNA binding have been proposed for winged-helix proteins such as OmpR (20) and ArgR (33).
The presumed PRPP binding site and flexible loop are candidates for direct DNA interactions with the PRT domain. These structural elements in the two subunits form a large, electropositive, conserved surface on the top of the PurR dimer opposite the positive surface of the winged-helix domains (Fig. 4 and 5). The DNA backbone may bind directly in adjacent PRPP sites of the dimer, as these sites are tailored for a molecule with ribose and phosphate moieties. Local PRPP-induced changes in the top electropositive surface may reduce DNA affinity enough to relieve repression by PurR. For example, binding of electronegative PRPP would greatly reduce the basicity and DNA affinity of the top surface of the dimer. Conformational changes in the flexible loop, such as those induced by PRPP in other PRT proteins, would further alter the surface encountered by DNA. Additionally, PRPP may induce conformational changes at domain interfaces, thus modulating DNA binding to additional sites on the PurR dimer outside the winged-helix domains. Large PRPP- or DNA-induced changes at the dimer interface of the winged-helix domains are unlikely because of the well-packed, compact, hydrophobic core between domains. However, the interfaces between the winged-helix and PRT domains within a monomer and the two PRT domains within a dimer are less well packed and may be subject to PRPP- or DNA-induced changes.
Shin et al. proposed that DNA wraps around PurR based on their finding that PurR induces right-handed supercoils in DNA (31). This is consistent with the conserved and positively charged bottom and top surfaces of the PurR dimer (Fig. 4 and 5). The B. subtilis PurBox sequences (28) are remarkable in the high AT content bordering the central conserved CGAA motif. These sequences could facilitate DNA bending in the outer regions of the PurBoxes. The possibility of such a deformation of DNA is also consistent with the results of torsional constraint experiments (31).
Binding of two or more PurR dimers to palindromic PurBoxes is highly cooperative. Cooperative DNA binding could be achieved through DNA deformation, as in the DNA complexes of winged-helix proteins RFX1 (12) and QacR (30) or by protein-protein contacts such as those in the DNA complex of the winged-helix protein FadR (37, 40). The structure and binding properties of PurR are consistent with either mechanism of cooperativity.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Present address: Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390.
Present address: National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709.
Present address: Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390.
| REFERENCES |
|---|