Cloning and characterization of the gene encoding inorganic pyrophosphatase of Escherichia coli K-12

Escherichia coli K-12 gene ppa encoding inorganic pyrophosphatase (PPase) was cloned and sequenced. The 5' end of the ppa mRNA was identified by primer extension mapping. A typical E. coli sigma 70 promoter was identified immediately upstream of the mRNA 5' end. The structural gene of ppa contains 528 base pairs, from which a 175-amino-acid translation product, Mr 19,572, was deduced. The deduced amino acid composition perfectly fitted with that of PPase as previously determined (P. Burton, D. C. Hall, and J. Josse, J. Biol. Chem. 245:4346-4351, 1970). Furthermore, the partial amino acid sequence (residues 1 to 108) of E. coli PPase determined by S. A. Cohen (Ph.D. thesis, University of Chicago, 1978) was the same as that deduced from the nucleotide sequence. This is the first report of the cloning of a PPase gene.

Inorganic pyrophosphatase (EC 3.6.1.1; PPase) is ubiquitous in nature and plays an important role in energy metabolism, providing a thermodynamic pull for biosynthetic reactions such as protein, RNA, and DNA synthesis (22). According to Peller (33), nucleic acid syntheses would be energetically impossible in vivo if they were not coupled to the hydrolysis of pyrophosphate (PPi), catalyzed by PPase. In addition, various findings suggest that PPase might have important roles not only in the regulation of macromolecular synthesis and growth (1,9,14,21) but also in evolutionary events by affecting the accuracy with which DNA molecules are copied during chromosome duplication (2,18,28).
While the occurrence, reaction mechanism, and structural and kinetic properties of this important housekeeping enzyme have been studied extensively (5,41), the regulation of PPase has received little attention. We have for some years studied the role of PPi and regulation of PPase in bacteria (24-26). To obtain a better insight into these topics we cloned and characterized the Escherichia coli ppa gene encoding PPase.
(A preliminary report has already been presented at the Fourth European Congress on Biotechnology (16).)

MATERIALS AND METHODS
Chemicals. Restriction endonucleases, T4 DNA ligase, exonuclease III (from E. coli B), and alkaline phosphatase from calf intestine were bought from Boehringer Mannheim, Mannheim, Federal Republic of Germany. Antibiotics, egg white lysozyme, and low-gelling-temperature (LGT) agarose (type VII) were obtained from Sigma Chemical Company, St. Louis, Mo. Mung bean nuclease was from Promega Biotec, Madison, Wis. DNA polymerase I (Klenow fragment) and [35S]dATP (500 Ci/mmol) were purchased from Du Pont, Boston, Mass.
Bacterial strains, plasmids, and culture conditions. The following strains of Escherichia coli were used. K-12 is a wild-type strain from the collection of this laboratory. Strain JP5 was produced by mutagenizing E. coli K-12 with NTG (N-methyl-N'-nitro-N-nitrosoguanidine). The isolation and characterization of this mutant will be published later, but in this connection it is important that this strain contains only about 15% of the normal PPase activity. To obtain stable transformants, we constructed a recA derivative of JP5, designated strain RT4, by generalized transduction with phage P1 (31) grown on E. coli JC10240 (6), obtained from the E. coli Genetic Stock Center. Strain RT4 was used as a host in the cloning.
* Corresponding author.
† Present address: Department of Forensic Medicine, University of Turku, SF-20520 Turku, Finland.
The plasmid pOU61 was donated by Soren Molin (Technical University of Denmark, Copenhagen, Denmark) in E. coli host CSH50. This plasmid has a copy number of 1 to 2 per genome equivalent at 30°C, but at 42°C several thousand copies of the plasmid are produced in one cell (27). The plasmid encodes β-lactamase.
The strains were grown in LB broth or on LA plates (29). Ampicillin (25 µg/ml) or tetracycline (15 µg/ml) was added when required. Growth was followed by measuring the light scattering of the cultures at 550 nm.
DNA isolation and cloning methods. E. coli CSH50 was cultured at 30°C to an A550 of 0.5. Then the cultivation was continued at 42°C for 2 h to amplify the plasmid pOU61. The cells were collected by centrifugation and lysed by lysozyme and alkali, and the plasmid pOU61 was isolated and purified by centrifugation to equilibrium in cesium chloride-ethidium bromide density gradients according to Maniatis et al. (29). Chromosomal DNA was isolated from the wild-type E. coli K-12 cells as described by Silhavy et al. (39).
Restriction endonuclease digestions of the plasmid and chromosomal DNA were performed as suggested by the suppliers. Other DNA manipulations, including isolation of restriction fragments from 0.7% LGT agarose gel, dephosphorylation of the linearized vector with calf intestinal phosphatase, and ligations, were carried out as described previously (29). E. coli RT4 cells were made competent by the calcium chloride procedure and transformed as described by Maniatis et al. (29). The transformants were grown on LA plates which contained ampicillin (25 µg/ml) by incubating at 30°C for 30 h. The colonies were screened by the magnesium pyrophosphate overlay test (23).
RNA methods. RNA was isolated as described by Igo and Losick (19). The 5' end of the ppa transcript was mapped by using the primer extension technique (30).
TABLE 1 footnotes. a K-12 is the wild-type strain used as a donor, and RT4 (ppa recA) was used as a recipient strain in the cloning experiment. RT4(pEV1) is the primary transformant from the cloning experiment containing the plasmid pEV1, which is a derivative of pOU61 with a 4-kb DNA fragment inserted into the BamHI site. RT4(pTP1) contains the plasmid pTP1, which is a deletant of pEV1 with a 1.3-kb DNA fragment inserted into the BamHI site of pOU61.
Owing to the amplification of the pOU61 derivatives at 42°C (27), the specific activities were measured from the cells grown at 30°C and for 2 h at 42°C, respectively.
Enzyme assay. PPase activity of the cell samples was determined by the method of Heinonen and Lahti (15).
DNA sequencing. Sequence analysis was done by the dideoxy chain termination method (36). For the sequence determination, an overlapping series of deletions was prepared for both orientations of the insert DNA in M13mp18 by unidirectional digestion with exonuclease III (17).
Computer analysis. Searches for open reading frames and terminator regions of ppa transcription, derivation of the amino acid sequence from the nucleotide sequence, and studies on codon usage were carried out with the Sequence Analysis Software Package supplied by the University of Wisconsin Genetics Computer Group (8).

RESULTS
Cloning strategy. Our strategy for the cloning of gene ppa encoding E. coli PPase was as follows. The chromosomal DNA isolated from E. coli K-12 was partially digested with Sau3AI. The digest was fractionated by gel electrophoresis in LGT agarose, and DNA fragments of 2 to 9 kb were extracted from the melted gel as described by Maniatis et al. (29). This fragment pool was ligated into the BamHI site of the vector pOU61. The plasmids were transformed into strain RT4 (ppa recA), and ampicillin-resistant transformants were screened for high PPase activity by the magnesium pyrophosphate overlay test (23).
Out of the 4,500 transformants obtained, 1 had 35-fold-higher PPase activity than the recipient strain, RT4. This clone, designated RT4(pEV1), had fivefold-increased PPase activity compared with the wild-type E. coli K-12 (Table 1). PPase activity of this clone was further increased sevenfold when the strain was grown for 2 h at 42°C. When the plasmid pEV1 isolated from this clone was retransformed into E. coli RT4, colonies with high PPase activity were obtained. Hence, it was evident that the plasmid pEV1 contained the ppa gene.
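The fold changes quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the RT4 host has the same residual activity as its parent JP5 (about 15% of wild type, as stated in Materials and Methods) and treats the quoted fold values as exact; the variable names are illustrative only.

```python
# Relative PPase activities, with wild-type E. coli K-12 set to 1.0.
# Assumption: RT4 retains the ~15% residual activity reported for its parent JP5.
rt4_vs_wild_type = 0.15

# RT4(pEV1) was reported as 35-fold higher than RT4 at 30°C.
pev1_vs_wild_type = 35 * rt4_vs_wild_type

# A further sevenfold increase was reported after 2 h at 42°C.
pev1_42_vs_wild_type = 7 * pev1_vs_wild_type

print(f"RT4(pEV1), 30°C: {pev1_vs_wild_type:.2f} x wild type")
print(f"RT4(pEV1), 42°C: {pev1_42_vs_wild_type:.2f} x wild type")
```

Under these assumptions the 30°C value comes out close to 5, in line with the fivefold increase over K-12 reported in the text.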
Subcloning. The plasmid pEV1 contains a 4-kb DNA fragment, inserted into the BamHI site of pOU61. Because E. coli PPase was known to consist of six identical subunits with a molecular weight of approximately 20,000 (43), the length of the structural gene was estimated to be about 500 to 600 bp. Hence, a DNA fragment of 1 to 2 kb was thought to be long enough to contain the whole ppa gene with its possible regulatory regions. For this reason, the 4-kb fragment encoding PPase was digested partially with Sau3AI restriction enzyme. Fragments 1 to 2 kb long were isolated from LGT agarose gel, ligated into the BamHI site of pOU61, and transformed into the RT4 strain. Screening of the transformants indicated several colonies with high PPase activity. For further studies we selected one clone, RT4(pTP1) (Table 1).
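The gene-size estimate above follows from the subunit molecular weight. A minimal sketch of that reasoning, assuming an average residue mass of about 110 Da (a common rule of thumb, not a value taken from this paper); the function name is hypothetical:

```python
# Estimate the length of a structural gene from the subunit molecular weight.
SUBUNIT_MR = 20_000        # approximate subunit size cited in the text (43)
AVG_RESIDUE_MASS = 110.0   # assumed rule-of-thumb mean residue mass, in Da


def estimate_gene_length_bp(subunit_mr: float,
                            residue_mass: float = AVG_RESIDUE_MASS) -> int:
    """Return the approximate coding length in base pairs (3 bp per residue)."""
    residues = subunit_mr / residue_mass
    return round(residues * 3)


estimated_bp = estimate_gene_length_bp(SUBUNIT_MR)
print(f"Estimated coding length: about {estimated_bp} bp")
```

With these inputs the estimate falls inside the 500- to 600-bp window given in the text, which is why a 1- to 2-kb fragment leaves room for flanking regulatory sequence.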
Plasmid pTP1 contains a 1.3-kb DNA fragment inserted into the BamHI site of pOU61. When this fragment was religated into the BamHI site of pOU61 and retransformed into RT4, colonies with high PPase activities were again obtained, whereas RT4 transformed with pOU61 produced only colonies with a low level of PPase activity. Hence, it is evident that the plasmid pTP1 contains the intact E. coli gene encoding inorganic pyrophosphatase. To our knowledge this is the first report of cloning of any PPase gene.
Nucleotide sequence of the gene encoding E. coli inorganic pyrophosphatase. For the sequence determination, the BamHI fragment of 1.3 kb encoding E. coli PPase was transferred from the plasmid pTP1 into the BamHI site of the replicative form of the M13mp18 vector. Exonuclease III was used to create a series of controlled unidirectional deletions of the 1.3-kb DNA fragment for both orientations in M13mp18. A partial restriction map and the strategy of nucleotide sequence analysis are depicted in Fig. 1. The entire sequence was determined for both strands from the overlapping exonuclease III deletants of the 1.3-kb fragment. The data in Fig. 2 show 291 nucleotides (nt) of the upstream flanking sequence, a 528-nt coding region, and 375 nt of the 3'-flanking sequence. The nucleotide sequence is numbered by designating the transcription start point as nt +1.
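As a quick consistency check on the figures just given, the reported segment lengths can be summed and the coding region converted to codons. The sketch assumes that the 528-nt coding region includes the initiator ATG but not the stop codon, and that the initiator Met is absent from the mature protein; the first point is not stated explicitly in the paper. The product size and Mr are the values quoted in the abstract.

```python
# Segment lengths reported for the sequence shown in Fig. 2.
UPSTREAM_NT = 291
CODING_NT = 528
DOWNSTREAM_NT = 375

total_nt = UPSTREAM_NT + CODING_NT + DOWNSTREAM_NT

# A coding region should be a whole number of codons.
codons, remainder = divmod(CODING_NT, 3)

# Assumption: one codon is the initiator Met, which is removed from the
# mature protein, and the stop codon is not counted in the 528 nt.
mature_residues = codons - 1

# Mean residue mass implied by the reported Mr of 19,572.
mean_residue_mass = 19_572 / mature_residues

print(f"Sequenced region: {total_nt} nt")
print(f"Codons: {codons} (remainder {remainder})")
print(f"Mature residues: {mature_residues}")
print(f"Mean residue mass: {mean_residue_mass:.2f} Da")
```

Under these assumptions the codon count agrees with the 175-residue product, and the summed length is consistent with an insert described as roughly 1.3 kb.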
The 5' end of ppa mRNA was determined by the primer extension method (30). A 19-mer oligonucleotide (5'-TACCCGCAGGGACGTTGAG-3'), complementary to nt 44 to 62 of ppa mRNA (Fig. 2), was annealed to total E. coli RNA isolated from RT4(pTP1) and used to prime avian myeloblastosis virus reverse transcriptase. Alignment of the cDNA product with the sequence of the ppa gene by using the termination method (36) with the same oligonucleotide as the primer identified the 5' end of the ppa mRNA that is the likely transcription initiation site of the ppa gene (Fig. 3).
The sequence upstream of the open reading frame encoding E. coli PPase contained an AAGACA at nt -35 and a TATAAT at nt -13 separated by 16 nt (Fig. 2). This closely matches the sequences and spacing of the elements that are [...]
FIG. 2. Nucleotide and deduced amino acid sequence of E. coli ppa. The nucleotide sequence is numbered by designating the likely transcription initiation site as nt +1. The seryl residue is the first amino acid in the mature translational product, and thus it is numbered as amino acid +1. The sequence landmarks are shown by asterisks. These include the -35 and -10 sequences typical of E. coli promoters, Shine-Dalgarno (SD) sequence, and the sequences shared by many E. coli rho-independent terminators. The dyad symmetries in the predicted terminator (nt 580 to 630) are indicated by arrows.
FIG. 3. Primer extension mapping of the ppa mRNA 5' termini. Reverse transcriptase and 5'-32P-labeled 19-mer oligonucleotide complementary to nt 44 to 62 of ppa mRNA (Fig. 2) were added to total E. coli RNA isolated from RT4(pTP1). The cDNA product was electrophoresed through a polyacrylamide-urea sequencing gel (lane 1). Lanes G-C contain the products of Sanger sequencing reactions, using the M13 clone of ppa as the template and the same primer. The sequence from the sequencing ladder is shown, together with the complementary strand on which is marked the likely transcription initiation site.
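The primer and promoter elements quoted above lend themselves to a short check. The sketch below derives the mRNA segment that the 19-mer should anneal to and counts matches to the commonly cited sigma 70 consensus hexamers (TTGACA at -35 and TATAAT at -10); those consensus sequences are textbook values rather than data from this paper, and the helper names are illustrative.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def count_matches(observed: str, consensus: str) -> int:
    """Count positions at which two equal-length sequences agree."""
    return sum(a == b for a, b in zip(observed, consensus))


PRIMER = "TACCCGCAGGGACGTTGAG"  # 19-mer from the text, 5' to 3'

# The mRNA region the primer anneals to (nt 44 to 62), written 5' to 3'.
target_rna = reverse_complement(PRIMER).replace("T", "U")

# Assumed textbook sigma 70 consensus hexamers.
minus35_matches = count_matches("AAGACA", "TTGACA")
minus10_matches = count_matches("TATAAT", "TATAAT")

print(f"Primer length: {len(PRIMER)} nt")
print(f"Target mRNA segment: 5'-{target_rna}-3'")
print(f"-35 matches: {minus35_matches}/6, -10 matches: {minus10_matches}/6")
```

The primer length agrees with the 19-nt span from nt 44 to nt 62, and the -10 hexamer is identical to the assumed consensus while the -35 hexamer differs at its first two positions.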
The rho-independent termination sequence of the ppa gene (at nt 580 to 630) predicted by the TERMINATOR program supplied by the University of Wisconsin Genetics Computer Group (8) is shown in Fig. 2. It includes a CGGGC region, a dyad symmetry, and a stretch of thymine residues (Fig. 2) that are shared by many E. coli terminator sequences (3). However, crucial signals for transcription termination reside primarily in the RNA structure (34). The transcript of this terminator could form the stem-and-loop structure shown in Fig. 4. This structure contains a stem of 11 bp, and it is followed by a stretch of U residues as is typical in rho-independent terminators in E. coli (3,34). The stability of stem-and-loop structures can be expressed as the ΔG° of the formation of the structure. On the basis of the values given by Freier et al. (10), we calculated the ΔG° for the putative terminator of the ppa gene as -10.1 kcal/mol. This is in the range (-8.0 to -17.9 kcal/mol) of the ΔG° values calculated by us for several terminators of E. coli and its phages on the [...]
Codon usage. Choice among synonymous codons is distinctly nonrandom in the ppa gene. It predominantly uses the codons ("optimal codons") generally preferred by highly expressed E. coli genes (Table 2). This is compatible with the fact that E. coli PPase is an abundant, housekeeping enzyme, the cellular content of which is approximately 1,000 molecules per genome (20; for comparison, see reference 11).
Derived amino acid sequence. The sequence shown in Fig. 2 contains an open reading frame of 528 nt, corresponding to the inorganic pyrophosphatase primary translation product. Serine is the amino-terminal amino acid residue in the mature E. coli PPase (4; S. A. Cohen, Ph.D. thesis, University of Chicago, 1978). Hence, the Met shown in Fig. 2 is removed by an aminopeptidase, creating the new terminus of the seryl residue that was originally the second in the chain.
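The codon bias described under "Codon usage" can be illustrated with the few Table 2 counts that are legible in this copy. The sketch below computes, for each amino acid whose synonymous codons all have surviving counts, the share taken by its most-used codon. It covers only those five amino acids, the function name is hypothetical, and it is not a full codon-usage analysis.

```python
# Codon counts legible in the surviving fragment of Table 2.
# Only amino acids with counts for every synonymous codon are included.
CODON_COUNTS = {
    "Gly": {"GGG": 0, "GGA": 0, "GGT": 6, "GGC": 3},
    "Glu": {"GAG": 4, "GAA": 11},
    "Asp": {"GAT": 5, "GAC": 9},
    "Tyr": {"TAT": 1, "TAC": 7},
    "Cys": {"TGT": 0, "TGC": 2},
}


def top_codon_share(codons: dict[str, int]) -> tuple[str, float]:
    """Return the most-used codon and its fraction of all uses."""
    total = sum(codons.values())
    top = max(codons, key=codons.get)
    return top, codons[top] / total


for amino_acid, codons in CODON_COUNTS.items():
    codon, share = top_codon_share(codons)
    print(f"{amino_acid}: {codon} accounts for {share:.0%} of {sum(codons.values())} codons")
```

In this fragment the rarely used glycine codons GGG and GGA do not occur at all, and TAC outnumbers TAT seven to one, which is the kind of skew the text refers to.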
The mature translation product contains 175 amino acid residues (Fig. 2), and it has a calculated molecular weight of 19,572. This is in good agreement with the subunit size of 20,000 determined for the purified E. coli PPase by Wong et al. (43). Amino acid composition of E. coli PPase in our [...] (Table 3). Burton et al. (4) determined the amino- and carboxy-terminal amino acids of E. coli PPase with the result NH2-Ser-Leu(Ile)-Leu(Ile)...Ala-Lys-COOH (the paper chromatographic systems employed in that study did not discriminate clearly between the phenylthiohydantoins of leucine and isoleucine). This is compatible with our sequence with the exception that the next-to-last amino acid in the carboxy terminus is asparagine instead of alanine (Fig. 2). Cohen (thesis) partially determined the amino acid sequence of E. coli PPase (residues 1 to 86, and a peptide containing 23 residues). In our work these two separate fragments were shown to form a continuous peptide covering the first 108 amino acids of the enzyme. The partial sequence presented by Cohen is the same as the amino-terminal sequence shown in Fig. 2 with the following exceptions: Glx-35 presented by Cohen is Glu-35; Glx-80 is Gln-80; Lys-81 is Pro-81; Gln-83 is Ser-83; and His-88 is Arg-88 (Fig. 2). The good agreement between the determined amino acid sequence of PPase and that predicted by our DNA sequence determination proves that our clone contains the structural gene of E. coli PPase.

TABLE 2. Codon usage in the ppa gene (surviving fragment only)
Amino acid  Codon  Count    Amino acid  Codon  Count
Gly         GGG    0        Trp         TGG    2
Gly         GGA    0        End         TGA    0
Gly         GGT    6        Cys         TGT    0
Gly         GGC    3        Cys         TGC    2
Glu         GAG    4        End         TAG    0
Glu         GAA    11       End         TAA    0
Asp         GAT    5        Tyr         TAT    1
Asp         GAC    9        Tyr         TAC    7
Val         GTG    2        Leu         TTG    0
Val         GTA    1        Leu         TTA    [...]

DISCUSSION
E. coli PPase is synthesized constitutively. Thus, the composition of the cell culture medium generally exerts no specific effects on the level of the enzyme (20). However, the production of E. coli PPase can be stimulated 1.5- to 3-fold by partial inhibition of DNA synthesis (14). Possibly, inhibition of DNA synthesis affects PPase synthesis indirectly, by disturbing the nucleotide metabolism, for example (24). The rate of synthesis of a constitutive protein is set genetically by the natural efficiency of the promoter of its gene, the rate at which ribosomes read the messenger, and the sensitivity of its messenger to degradation by nucleases (42). In some cases constitutive synthesis is a result of autogenous regulation (32,37). By the in vitro transcription-translation of pTP1, we will see whether autogenous regulation also has some role in the production of E. coli PPase. As mentioned above, the sequences and spacing of the elements in the 5'-flanking region of ppa closely match the consensus sequences of E. coli genes. Furthermore, the codon usage is clearly biased in ppa (Table 2). Hence, ppa seems to be both transcribed and translated efficiently in E. coli. This is compatible with the fact that PPase constitutes about 0.2% of the total soluble protein in this bacterium (20).
Mutants having an altered structure or regulation of PPase would be most valuable in the elucidation of the role of PPase in metabolism. However, very little has been published concerning such mutants. Josse and Wong (20) checked at random 5,000 colonies after mutagenesis, and they found three strains with diminished PPase activity. One of them contained only 2% of the normal PPase activity, but the mutant grew as fast as the wild type. We have also put a considerable amount of time and effort into experiments aimed at isolation of PPase mutants of E. coli K-12 after mutagenesis in vivo. We have not found any strain completely devoid of PPase activity, but some strains with slightly higher or lower levels of PPase activity compared with that of the wild type have been obtained (24). One of these mutants (RT4) with low PPase activity was used as a host in the cloning of E. coli ppa as described in this paper. Now that we have the ppa gene cloned, we can try to produce a null mutant by replacing the wild-type gene present in the genome with a nonfunctional mutant constructed in vitro by the method described by Gutterson and Koshland (12), for example. However, we believe that a ppa null mutant is not viable, and hence it cannot be isolated. In the case of essential enzymes, temperature-sensitive mutants are generally produced to circumvent this problem. We have made several attempts to isolate such PPase mutants with no success (24). One reason for our failure may be the unusually high thermostability of E. coli PPase (20).
VOL. 170, 1988
As we planned the strategy for cloning the E. coli ppa gene, we thought that expression of high levels of PPase might be harmful or toxic to the cell. However, this is not the case. The strains listed in Table 1 all grew equally well in a rich medium, even at 42°C for a few hours, although the expressed levels of PPase ranged from about 6 times lower to 60 times higher than that of the wild type (K-12). After 2 h of growth at 42°C, the cells gradually started to die owing to the extensive amplification of the plasmid (27). Even then there was no difference in growth between the RT4 strain containing the pOU61 plasmid and that containing its derivative pTP1, which carries the ppa gene.
It is logical to expect that the intracellular PPi concentration would be dependent on the level of PPase. We have studied the relationship between PPi and PPase in different growth conditions and also in conditions where the production of PPase is stimulated (24). No clear correlation has been found so far between the intracellular PPi concentration and the level of PPase (24). However, this project is still in progress, and it will be interesting to see what the PPi content is in the strains listed in Table 1, for example.