Characterization of the spoIVB and recN loci of Bacillus subtilis

Two independent genes, recN and spoIVB, along with their respective promoter and termination regions, were discovered and sequenced in the 3.4-kilobase region between the ahrC and spoOA genes at map position 216 on the Bacillus subtilis chromosome map. The gene adjacent to ahrC encoded a 576-amino-acid protein with high homology to the Escherichia coli recN gene product. The sequence revealed a 64,472-dalton polypeptide containing a conserved ATP-binding site, and possible lexA-type regulatory binding sequences in the promoter region of the gene. A second open reading frame, identified as the spoIVB gene, was directly downstream of recN. It consisted of 1,275 nucleotides which coded for a 425-amino-acid polypeptide with a molecular weight of 45,976. Phenotypic, genetic, and transcriptional analyses confirmed that this gene was spoIVB. Although no chloroform-resistant spores were produced by spoIVB-inactivated strains, phase-gray forespores were visible under microscopic examination. The spoIVB165 mutation was localized to a 200-base-pair region in the amino-terminal portion of the polypeptide. spoIVB was not transcribed until hour 2 of sporulation in wild-type B. subtilis cells, as determined by beta-galactosidase activity assays from lacZ transcriptional fusion constructions. We found no amino acid sequence homology between the spoIVB gene product and other known bacterial proteins.

Sporulation in Bacillus subtilis may be interrupted genetically by mutations in a large number of genes (10,12). These genes are defined phenotypically by the stage of sporulation at which they stop when mutationally inactivated. Although many sporulation genes have been defined and mapped, only a few have been assigned an enzymatic or regulatory function. In the case of those genes in which a mutation gives rise to a stage 0 phenotype, we now know that two of the genes, spoOF and spoOA, code for proteins with homology to the regulator components of two-component regulatory systems (7,19,23). It is generally agreed that the regulator component of these systems is phosphorylated by the sensor component, which acts as a kinase after stimulation by its specific effector molecule (15). These two components are usually the products of linked, coregulated chromosomal genes. The spoOA and spoOF genes differed from this pattern in that no sensor component gene was found genetically linked to either locus. Sequencing for a substantial distance upstream and downstream of the spoOF gene did not reveal a possible kinase gene for SpoOF (23).
Very little of the sequence surrounding the spoOA locus has been reported (7). It seemed possible that a gene for a sensor component acting on the SpoOA protein could be located in the unsequenced region surrounding it. We report here the sequence upstream of the spoOA gene that contains a possible recN gene and the spoIVB locus.
MATERIALS AND METHODS
Bacterial strains, growth conditions, and transformation. B. subtilis strains used in this study are shown in Table 1. Selection for Ermr was as previously reported by Youngman et al. (25). B. subtilis strains were transformed by the method of Anagnostopoulos and Spizizen (1). Plasmid DNA (1 to 2 µg) was used in each transformation. Cmr selection was on Schaeffer medium (22) supplemented with 5 µg of chloramphenicol per ml.
Sporulation efficiency was assayed by growing the strains in 5 ml of Schaeffer medium at 37°C for 24 h. The 5-ml culture was exposed to 1 ml of CHCl3 for 30 min to kill any vegetative cells present (9). Serial dilutions were then plated onto Schaeffer plates, and the colonies were counted as an indication of the number of viable spores present in the original 5-ml culture.
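The plate counts described above translate into a spore titer by simple arithmetic. The following Python sketch shows one way to do this. The function names, the dilution factor, and the plated volume are illustrative assumptions, since the text does not state which dilutions or plating volumes were used.

```python
def viable_spores_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Estimate chloroform-resistant CFU per ml of the original culture.

    dilution_factor is the total fold dilution of the plated sample
    (for example 10**6 for a 10^-6 dilution). All inputs are hypothetical.
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    return colonies * dilution_factor / plated_volume_ml


def relative_sporulation(mutant_titer, control_titer):
    """Express a mutant spore titer as a fraction of the control titer."""
    if control_titer <= 0:
        raise ValueError("control titer must be > 0")
    return mutant_titer / control_titer


if __name__ == "__main__":
    # Hypothetical counts: 150 colonies from a 10^-6 dilution of the control,
    # no colonies from the undiluted mutant culture, 0.1 ml plated each time.
    control = viable_spores_per_ml(150, 10**6, 0.1)
    mutant = viable_spores_per_ml(0, 1, 0.1)
    print(control, mutant, relative_sporulation(mutant, control))
```

A count of zero colonies only bounds the titer from above (fewer than one CFU per plated volume), so a result of zero should be read as "below the detection limit" rather than as an exact value.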
Escherichia coli DH5α competent cells (Bethesda Research Laboratories, Inc.) were used for plasmid construction and propagation. E. coli cultures were grown in Luria-Bertani medium supplemented with 100 µg of ampicillin per ml.
DNA manipulations. Plasmid DNA was prepared by the method of Birnboim and Doly (2). Rapid plasmid DNA preparation was accomplished by the method of Holmes and Quigley (11). The plasmids used in this study were pJH1408 (7); pJM102 and pJM103 (M. Perego and J. A. Hoch, unpublished data), which were used as vectors in the cloning and sequencing of recN and spoIVB; and pJM783, an integrative lacZ transcriptional fusion vector (Perego and Hoch, unpublished data).
Exonuclease III digestion of plasmid pJB2001 was carried out as described by the manufacturer of the enzyme, Boehringer Mannheim Biochemicals. Plasmid pJB2001 was cut with SmaI and SstI (two sites unique to the multiple cloning region of plasmid pJB2001), which resulted in unidirectional digestion of the inserted fragment from the SmaI site. Digestion of plasmid pJB2001 by Exonuclease III for 90 s or 2 min at 37°C produced plasmids pJB2018 and pJB2019, respectively.
Sequence analysis. The sequence analysis of both strands was accomplished by the supercoil sequencing method of Chen and Seeburg (3). The dideoxy chain termination and elongation reactions were performed by the method of Sanger et al. (20a). Sequencing and reverse-sequencing primers from New England BioLabs, Inc., were used, as were two oligonucleotides supplied by the Scripps Clinic and Research Foundation Core Lab: spoOA, 5'-CTTGCTACATGTTTACA-3', and spoIVB, 5'-GCGGTTTGCATAAACCT-3'. The spoIVB primer was also used in the determination of the transcription start site.
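When working with primers such as the two listed above, it is often useful to check their length and to write out the strand they anneal to. The short Python sketch below does this for the two primer sequences quoted in the text. The helper name and the dictionary layout are illustrative assumptions, not part of the original methods.

```python
# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    try:
        return "".join(COMPLEMENT[base] for base in reversed(seq))
    except KeyError as err:
        raise ValueError(f"unexpected base: {err.args[0]}") from None


# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "spoOA": "CTTGCTACATGTTTACA",
    "spoIVB": "GCGGTTTGCATAAACCT",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), reverse_complement(seq))
```

Both oligonucleotides are 17 bases long, which agrees with the description of the spoIVB primer as a 17-mer.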
Transcriptional initiation site determination. The mRNA isolation and primer extension experiments were accomplished as described previously by Perego et al. (17) with a few modifications. mRNA from strain JH642 was isolated at four different times during sporulation, T0, T1, T2, and T3.
The primer extension analysis utilized the spoIVB number 1 17-mer primer for the reverse transcriptase reaction.
Approximately 80 µg of RNA isolated at T0, T1, T2, and T3 was used in each reverse transcription reaction. Reverse transcriptase (0.5 U/µl; Life Sciences, Inc.) was added to each reaction, and reaction mixtures were incubated at 37°C for 1 h. The reverse transcripts were then run on a 6% (24:1 cross-link) polyacrylamide-8 M urea sequencing gel next to the sequence of the promoter region to determine the actual start site of transcription. The sequencing reactions were carried out with the same primer as was used in the reverse transcription reaction.
β-Galactosidase assay. The B. subtilis strains carrying the integrated spoIVB promoter-lacZ fusion were assayed for their β-galactosidase activities as described previously (6). Activity was measured in Miller units (14).
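For readers unfamiliar with Miller units, the sketch below computes them with the widely used form of Miller's formula, 1000 x (A420 - 1.75 x A550) / (t x v x A600). This is an assumption about the calculation: the B. subtilis protocol of reference 6 may differ in its details, and the function name and sample readings are hypothetical.

```python
def miller_units(a420, a550, a600, minutes, volume_ml):
    """Compute beta-galactosidase activity in Miller units.

    a420: absorbance of the reaction at 420 nm (o-nitrophenol plus scatter)
    a550: absorbance at 550 nm, used to correct for light scattering
    a600: culture density at the time of sampling
    minutes: reaction time
    volume_ml: volume of culture used in the assay
    """
    if a600 <= 0 or minutes <= 0 or volume_ml <= 0:
        raise ValueError("a600, minutes, and volume_ml must be > 0")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


if __name__ == "__main__":
    # Hypothetical readings for one time point of a sporulating culture.
    print(miller_units(a420=0.9, a550=0.2, a600=0.5, minutes=10, volume_ml=0.1))
```

Plotting the resulting values against time relative to T0 gives an expression curve of the kind shown for the spoIVB-lacZ fusion.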

RESULTS
Sequencing studies. The original Charon 4A bacteriophage containing the spoOA locus consisted of several EcoRI fragments, of which a 5.3-kilobase fragment transformed for spoOA, ahrC, and strC mutations (7). This fragment has recently been shown to be 6.2 kilobases, and a section of it encoding the ahrC gene has been sequenced (16). We determined the complete sequence of the region between the ahrC and spoOA genes of the 6.2-kilobase EcoRI fragment in our search for kinase genes with specificity for SpoOA. The sequencing strategy is shown in Fig. 1, and the complete sequence is shown in Fig. 2. The amino terminus of recN agreed with that of North et al. (16) and was not sequenced on both strands. Two complete open reading frames could be found in the region between the ahrC and spoOA genes.
Characterization of recN gene. An open reading frame lying just downstream of the ahrC gene was identified as a possible recN gene by a computer homology search. The putative recN gene encoded a 576-amino-acid polypeptide with a deduced molecular weight of 64,472. Similarities between the entire sequences of the E. coli and B. subtilis proteins were consistent with the hypothesis that these proteins are highly related and probably functionally identical (Fig. 3). The greatest conservation of amino acids between the two proteins occurred in the amino-terminal portion of the protein and in a region towards the carboxy terminus. The amino terminus of this protein contained a potential ATP-binding site, as expected for a RecN protein (Fig. 3).
The promoter regions of the E. coli recN, lexA, and recA genes possess consensus lexA box sequences that serve as regulatory binding sites for the LexA protein (20). We observed what could be similar lexA-type consensus sequences in the upstream region of the B. subtilis recN gene (Table 2).
FIG. 3. Sequences were aligned by using the CLUSTAL 4 program of Higgins and Sharp (8). Asterisks mark identical residues; conserved residues are marked separately.
The second open reading frame, directly downstream of recN, encoded a 425-amino-acid polypeptide with a deduced molecular weight of 45,975. Studies were undertaken to determine whether this potential protein corresponds to the product of the spoIVB gene (18) known to be in this region of the chromosome (5). Plasmids were constructed with various restriction fragments from this region in the integrative vectors pJM102 or pJM103 (Fig. 1) and were used to transform strains SL765 and 165.1 containing the spoIVB165 mutation. Plasmid pJB2001 was capable of transforming the spoIVB165 mutation to prototrophy. The gene in which a mutation gives rise to this stage IV phenotype is therefore the spoOA gene, the recN gene, or the unknown open reading frame. The genetic analysis was continued by transforming plasmids pJB2001, pJB2015, pJB2018, pJB2019, pJB2023, and pJB2025 into strains SL765 and 165 with selection for Cmr. If the wild-type allele of the spoIVB165 mutation resided in the donor plasmid, both Spo+ and Spo- transformants would occur after a Campbell-type recombination event. We obtained Spo+ and Spo- colonies after transformation with plasmids pJB2001, pJB2015, pJB2018, pJB2023, and pJB2025. Plasmid pJB2019 as donor gave only Spo- transformants. It was therefore concluded that the spoIVB165 mutation must be in the 200-base-pair region common to plasmids pJB2001, pJB2015, pJB2018, pJB2023, and pJB2025 and not carried by pJB2019. This 200-base-pair region is wholly contained within the amino-terminal end of the polypeptide encoded by the unassigned open reading frame, suggesting that it is the spoIVB locus.
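The mapping argument above is an interval calculation: the mutation must lie in the region shared by every insert that yields Spo+ transformants and outside every insert that does not. The Python sketch below illustrates that logic. The coordinates are invented for illustration and are not the real insert boundaries of the pJB plasmids, and the function name is hypothetical.

```python
def localize_mutation(rescuing, non_rescuing):
    """Return the intervals common to all rescuing inserts and absent
    from all non-rescuing inserts.

    Each insert is a (start, end) pair of base-pair coordinates,
    half-open, on a shared map.
    """
    start = max(s for s, _ in rescuing)
    end = min(e for _, e in rescuing)
    if start >= end:
        return []  # the rescuing inserts share no common region
    regions = [(start, end)]
    for s, e in non_rescuing:
        remaining = []
        for a, b in regions:
            if e <= a or s >= b:
                remaining.append((a, b))  # no overlap, keep as is
            else:
                if a < s:
                    remaining.append((a, s))  # part left of the excluded insert
                if e < b:
                    remaining.append((e, b))  # part right of the excluded insert
        regions = remaining
    return regions


if __name__ == "__main__":
    # Hypothetical coordinates: two rescuing inserts and one that fails.
    print(localize_mutation([(0, 2000), (1200, 2600)], [(1400, 3000)]))
```

With these made-up coordinates the candidate region is a single 200-base-pair window, which mirrors the way the difference between pJB2018 and pJB2019 bounds the spoIVB165 mutation.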
To confirm further that this open reading frame encoded the spoIVB gene, inactivation studies were carried out by inserting the ermG gene into the open reading frame at two locations. The first insertional inactivation was made by placing the ermG gene into the unique BalI restriction site in plasmid pJB2001, which resulted in plasmid pJB2002 (Fig. 1). The same erythromycin resistance gene was also inserted into the downstream ClaI restriction site in plasmid pJB2001, producing plasmid pJB2004 (Fig. 1). (Note that the upstream ClaI restriction site carried on plasmid pJB2001 is not cut by ClaI because of DNA methylation protection.) Both plasmids, pJB2002 and pJB2004, were linearized and transformed into strain JH642 with selection for erythromycin resistance, giving rise to strains JH12719 and JH12720. The Ermr transformants obtained from each transformation were the result of a double crossover event as determined by Southern blot analysis (data not shown).
Both strains carrying the insertional inactivations displayed the same phenotypic differences from the parental strain JH642. The mutant phenotype was that of a late-stage sporulation-defective strain as determined by observations of colonies on Schaeffer sporulation medium and by microscopic examination. Under the microscope, some phase-gray forespores were evident after 48 h of growth at 37°C on Schaeffer sporulation medium. No phase-bright spores were observed in these strains. We also carried out a quantitative analysis of the sporulation efficiency in spoIVB165 strains and our insertionally inactivated spoIVB mutant strains JH12719 and JH12720. In contrast to the control strain, JH642, all spoIVB strains evaluated produced no viable chloroform-resistant spores (Table 3).
spoIVB transcriptional initiation site determination. The recN gene was followed by an inverted-repeat structure reminiscent of a terminator, which suggested that the spoIVB gene was not cotranscribed with recN. In order to determine the start site or sites of spoIVB transcription, primer extension experiments using the oligonucleotide shown in Fig. 2 were performed on mRNA from strain JH642. mRNA was isolated from this strain at T0, T1, T2, and T3 of sporulation. The precise location of the start site of transcription from each transcript was determined by running the reverse transcript next to a sequence of the promoter region with the same oligonucleotide primer. Two reverse transcripts were obtained (Fig. 4). Although both of these transcripts first appeared at T2, only the smaller transcript remained visible by T3.
Transcriptional analysis of spoIVB. In order to verify the point in the cell life cycle at which the spoIVB gene was transcribed, we constructed a lacZ transcriptional fusion with spoIVB. This was accomplished with the integrative lacZ transcriptional fusion vector pJM783. The 450-base-pair StyI-SspI fragment containing the spoIVB promoter was inserted into the unique SmaI site adjacent to lacZ of pJM783. The resulting junctions were sequenced to ensure that the correct orientation of the spoIVB promoter with respect to the lacZ gene was obtained. A correctly constructed plasmid, pJB2026, was transformed into strain JH642 with selection for Cmr. The resulting Cmr colonies were light blue after 48 h at 37°C on Schaeffer sporulation medium containing 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (40 µg/ml). This strain, JH12713, was assayed for β-galactosidase production as a function of growth and sporulation. The results are shown in Fig. 5. The spoIVB gene was not transcribed until T2, and transcription continued until at least T4.

DISCUSSION
Sequencing studies of the region of the chromosome between the ahrC and spoOA genes revealed the presence of two open reading frames of substantial size. The first of these coded for a 576-amino-acid protein with high homology to the recN gene product of E. coli. The highest homology was observed in the amino- and carboxyl-terminal portions of the proteins. The amino-terminal region also contained the GXXXXGK sequence characteristic of ATP-binding sites (20). The translation start site deduced from the sequence of the recN gene of E. coli was in some doubt, as two possible Met start codons were observed in the open reading frame (20). Our data suggest that the upstream start codon is the correct one, since the deduced B. subtilis protein was highly homologous to the deduced E. coli protein in the region between the two possible start codons. Furthermore, the downstream Met codon was not conserved in B. subtilis. The high homology and similar sizes of the two proteins strongly suggest that the recN equivalent in B. subtilis has been identified. However, no inactivation studies have been undertaken to test this conclusion.
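The GXXXXGK pattern mentioned above is simple enough to search for with a regular expression. The Python sketch below scans a protein sequence for every occurrence, including overlapping ones. The function name and the example sequence are hypothetical; the example is a made-up peptide, not the RecN sequence.

```python
import re

# G, any four residues, then G and K; the lookahead allows overlapping hits.
ATP_BINDING_MOTIF = re.compile(r"(?=(G.{4}GK))")


def find_atp_binding_motifs(protein):
    """Return (1-based position, matched text) for each GXXXXGK occurrence."""
    protein = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in ATP_BINDING_MOTIF.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the search.
    print(find_atp_binding_motifs("MKTLAGPSGAGKSTLL"))
```

A match to so short a pattern is only suggestive; in the text the assignment also rests on the position of the motif near the amino terminus and on its conservation between the two RecN sequences.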
The recN gene of E. coli is under SOS control and regulated directly by LexA (24). The promoter for recN contains LexA-binding sites consistent with its control pattern (20). The B. subtilis recN gene has LexA-like boxes upstream of the coding region, and a system similar to SOS control has been observed in B. subtilis (13). Interestingly, the LexA-like boxes overlap the C-terminal reading frame of the ahrC gene, and no obvious inverted-repeat terminator structure is present. It seems possible that ahrC and recN are cotranscribed and that under SOS stress conditions, recN can be differentially expressed. No experiments were undertaken to identify the transcription start site of B. subtilis recN, so it is not known whether the LexA-like boxes are near the promoter for this gene.
Two sets of experiments identified the spoIVB gene. Transformation of the spoIVB165 mutation by integrative plasmids carrying various restriction fragments located the mutation to within 200 base pairs in an open reading frame. Insertional inactivation of this open reading frame at two positions by an erm gene yielded Spo- transformants with the same characteristics as the strain carrying the spoIVB165 mutation. The strains bearing the spoIVB165 mutation and the erm-induced strains differ from the original description of the spoIVB mutant (4) in that they are not oligosporogenous (at least when tested by chloroform). We never found chloroform-resistant survivors with these strains under our culture conditions. It seems possible that the original strain, P7, was partially suppressed for its sporulation defect, since the mutation gives complete sporulation deficiency when backcrossed to the sporulating parent of SL765. The classification of this mutation as one giving rise to a stage IV defect depends on the original classification of the P7 mutant, however.
Expression of the spoIVB locus occurs during sporulation, as determined from both mRNA isolation and lacZ fusion studies. mRNA for this locus first appeared at T2 of sporulation, when two transcripts were observed. The start sites are 22 base pairs apart in the region between recN and spoIVB. Neither promoter defined from the start sites had unequivocal homology to known promoter sequences, so it was not possible to assign a sigma factor responsible for this transcription. The upstream transcript was gone by T3. A lacZ transcription fusion to this promoter region produced β-galactosidase beginning at T2 and continuing to at least T4, which correlates directly with the mRNA studies. This timing of spoIVB expression is more characteristic of genes in which mutations cause a block earlier than stage IV of sporulation.
The spoIVB locus codes for a protein of 425 amino acids, with a molecular weight of 45,975. The protein is somewhat basic in that it has a calculated isoelectric point of 9.20. The function of the protein remains a mystery, and homology searches of sequenced proteins in the GenBank database did not reveal any homologies of significance. One-half of the first 20 residues at the amino terminus are hydrophobic, and this region lacks negatively charged residues. These properties are characteristic of signal sequences for secretion, or they might indicate the possibility of membrane association. The sequence I-K-V-T-G-K-K-S-G-E-S-E-L-V-Y beginning at codon 72 is predicted to form a helix-turn-helix structure (21), and it has some homology to other known regulatory proteins that use this structure for DNA binding. It seems fruitless to speculate further on function from primary sequence data without more knowledge of the properties of the protein. It will be of interest to see if this locus codes for a regulatory protein.
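The sizes quoted for the two open reading frames can be cross-checked with a little arithmetic: a coding region of 1,275 nucleotides corresponds to 425 codons, and dividing each deduced molecular weight by the residue count gives an average residue mass in the range expected for proteins. The Python sketch below does this with the figures reported in the text; the function names are illustrative, and whether the 1,275-nucleotide figure includes the stop codon is not stated.

```python
def codons_in_orf(nucleotides):
    """Number of codons in a coding region whose length is a multiple of 3."""
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("length must be a positive multiple of 3")
    return nucleotides // 3


def mean_residue_mass(molecular_weight, residues):
    """Average mass per amino acid residue, in daltons."""
    if residues <= 0:
        raise ValueError("residue count must be > 0")
    return molecular_weight / residues


if __name__ == "__main__":
    print(codons_in_orf(1275))                          # spoIVB coding region
    print(round(mean_residue_mass(45975, 425), 1))      # SpoIVB
    print(round(mean_residue_mass(64472, 576), 1))      # RecN
```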