ABSTRACT
The glpD (MSMEG_6761) gene encoding glycerol-3-phosphate dehydrogenase was shown to be crucial for M. smegmatis to utilize glycerol as the sole carbon source. The glpD gene likely forms the glpFKD operon together with glpF and glpK, encoding a glycerol facilitator and glycerol kinase, respectively. The gylR (MSMEG_6757) gene, whose product belongs to the IclR family of transcriptional regulators, was identified 182 bp upstream of glpF. It was demonstrated that GylR serves as a transcriptional activator and is involved in the induction of glpFKD expression in the presence of glycerol. Three GylR-binding sites with the consensus sequence (GKTCGRC-N3-GYCGAMC) were identified in the upstream region of glpF by DNase I footprinting analysis. The presence of glycerol-3-phosphate was shown to decrease the binding affinity of GylR to the glpF upstream region with changes in the quaternary structure of GylR from tetramer to dimer. Besides GylR, cAMP receptor protein (Crp) and an alternative sigma factor, SigF, are also implicated in the regulation of glpFKD expression. Crp functions as a repressor, while SigF induces expression of glpFKD under energy-limiting conditions. In conclusion, we suggest here that the glpFKD operon is under the tripartite control of GylR, SigF, and Crp, which enables M. smegmatis to integrate the availability of glycerol, cellular energy state, and cellular levels of cAMP to exquisitely control expression of the glpFKD operon involved in glycerol metabolism.
IMPORTANCE Using genetic approaches, we first revealed that glycerol is catabolized through the glycolytic pathway after conversion to dihydroxyacetone phosphate in two sequential reactions catalyzed by glycerol kinase (GlpK) and flavin adenine dinucleotide (FAD)-containing glycerol-3-phosphate dehydrogenase (GlpD) in M. smegmatis. Our study also revealed that in addition to the GylR transcriptional activator that mediates the induction of the glpFKD operon by glycerol, the operon is regulated by SigF and Crp, which reflect the cellular energy state and cAMP level, respectively.
INTRODUCTION
Glycerol is aerobically catabolized in bacteria through two metabolic pathways (1). One includes two consecutive reactions catalyzed by glycerol kinase (GlpK) and either glycerol-3-phosphate (G3P) dehydrogenase or G3P oxidase. In this pathway, glycerol is first converted to G3P by glycerol kinase at the expense of ATP, and G3P is catabolized through the Embden-Meyerhof-Parnas pathway following its conversion to dihydroxyacetone phosphate (DHAP) by G3P dehydrogenase (or oxidase). In the alternative pathway, glycerol is oxidized by glycerol dehydrogenase to dihydroxyacetone (DHA), which is subsequently converted to DHAP by DHA kinase.
Mycobacterium tuberculosis and Mycobacterium smegmatis were shown to grow faster and to achieve a higher biomass on glycerol than on glucose (2–5). Glycerol catabolism was shown to be unaffected by the presence of another carbon source such as glucose in M. tuberculosis, which can simultaneously cocatabolize different carbon sources (2). Due to the lack of the gene encoding glycerol dehydrogenase and the indispensability of glycerol kinase in glycerol utilization, it was suggested that M. tuberculosis catabolizes glycerol using glycerol kinase and G3P dehydrogenase (6).
Three classes of G3P dehydrogenase are found in bacteria. The membrane-bound G3P dehydrogenase encoded by glpD is a flavoprotein that catalyzes the oxidation of G3P to DHAP with the concomitant reduction of ubiquinone or menaquinone of the respiratory electron transport chain (ETC) (7). GlpD is known to be required for dissimilation of glycerol as a carbon and energy source (8–10). Another membrane-bound, ETC-linked G3P dehydrogenase was identified in Escherichia coli and shown to be expressed under anaerobic growth conditions. This anaerobic G3P dehydrogenase is encoded by the glpABC operon (10). Unlike GlpD and anaerobic G3P dehydrogenase, the soluble G3P dehydrogenase encoded by gpsA is an NAD(P)H-linked dehydrogenase and catalyzes mainly the reduction of DHAP to G3P using NAD(P)H as a reductant. The resulting G3P is used to synthesize glycerophospholipids for membrane biogenesis, as well as triacylglycerol in mycobacteria as a storage molecule or as a sink for reducing equivalents under conditions of oxygen limitation (11, 12).
Glycerol kinase is encoded by glpK. The glpK gene usually forms an operon together with the glpF gene, which encodes a glycerol facilitator belonging to the aquaporin family of passive transporters (13). While the glpD gene is transcriptionally separated from the glpFK operon in E. coli, Yersinia pestis, Bacillus subtilis, Listeria monocytogenes, and Pseudomonas aeruginosa (14–18), it forms a transcriptional unit with other glycerol catabolic genes in the form of either the glpFKD operon in actinobacteria such as Streptomyces clavuligerus and Streptomyces coelicolor or the glpDK operon in mycoplasmas (1, 19, 20).
Expression of many glycerol catabolic genes is known to be subjected to substrate induction and catabolite repression. Induction of glycerol catabolic genes by glycerol has been shown to be mediated by the GlpR, GylR, and GlpP transcription factors in E. coli, S. coelicolor, and B. subtilis, respectively. GlpR and GylR function as DNA-binding repressors in the regulation of glycerol catabolic genes (18, 21–24), while GlpP serves as an RNA-binding antiterminator in the transcription of glycerol catabolic genes (17, 25, 26). The presence of a preferred carbon source such as glucose was shown to lead to repression of glycerol catabolic genes in E. coli, B. subtilis, and Enterococcus faecalis (27–31). Exceptionally, expression of glycerol catabolic genes in Mycoplasma pneumoniae was shown to be neither induced by glycerol nor under the control of catabolite repression (1).
Although it was demonstrated that glycerol kinase (GlpK) is essential for M. tuberculosis to grow under conditions where glycerol is supplied as the sole source of carbon (6), there has been no detailed report revealing the genes involved in glycerol catabolism and their regulation in mycobacteria. In this first report, we describe the role and regulation of the glpFKD operon involved in glycerol catabolism in M. smegmatis.
RESULTS
Identification of the genes involved in glycerol catabolism in M. smegmatis. The genome sequence of M. smegmatis mc2155 was searched to identify the genes involved in glycerol catabolism. Three consecutive open reading frames, designated MSMEG_6758, MSMEG_6759, and MSMEG_6761, were identified that showed homology to the glpF, glpK, and glpD genes, respectively, identified in other bacteria (Fig. 1). The short (23-bp) intergenic region between MSMEG_6758 and MSMEG_6759 and the 4-bp overlap of MSMEG_6759 and MSMEG_6761 suggest that these three genes form a glpFKD operon in M. smegmatis, as do the equivalent genes in S. coelicolor, which, like mycobacteria, belongs to the phylum Actinobacteria (20).
Genetic organization of the glpFKD locus in M. smegmatis mc2155 and the upstream sequence of the glpFKD operon encompassing its promoter region and putative cis-acting elements involved in the regulation of the glpFKD operon. (A) The lengths of the overlapping and intergenic regions are given as the nucleotide numbers above the schematic diagram. (B) Four inverted repeats (IR1, IR2, IR3, and putative Crp-binding site) are marked by the head-facing arrows above their sequences. The putative promoter region of the glpFKD operon is boxed. The transcriptional start point (+1) of the glpFKD operon deduced from RNA deep-sequencing data from M. smegmatis is shaded in gray. The start codons of glpF and gylR are underlined, and the arrows above them indicate the transcriptional direction. The numbers to the left of the sequences indicate the positions of the leftmost nucleotides relative to the glpF gene. RBS, ribosome-binding site.
The glpK (MSMEG_6759) and glpD (MSMEG_6761) genes encode glycerol kinase and flavin adenine dinucleotide (FAD)-containing G3P dehydrogenase, respectively, while the glpF gene codes for a glycerol facilitator protein. M. smegmatis contains two additional glpK genes, MSMEG_6229 and MSMEG_6756. As judged by the values corresponding to the number of mapped reads per kilobase pair per million (RPKM) from RNA sequencing analysis on M. smegmatis grown aerobically in glucose-7H9 medium (32), MSMEG_6759 appears to be the major glycerol kinase in M. smegmatis (see Table S1 in the supplemental material). In descending order of the RPKM values, we annotated MSMEG_6759, MSMEG_6756, and MSMEG_6229 as glpK, glpK2, and glpK3, respectively. In contrast to M. smegmatis, M. tuberculosis H37Rv has a single glpK gene (Rv3696c), the product of which shows 76%, 53%, and 83% identity to the gene products of MSMEG_6759, MSMEG_6756, and MSMEG_6229, respectively. Two additional glpD genes were also identified from the genome sequence of M. smegmatis. Applying the same principle as for the glpK paralogs, we annotated MSMEG_6761, MSMEG_1736, and MSMEG_4332 as glpD, glpD2, and glpD3, respectively (Table S1). The RPKM values of the three glpD homologs indicate that glpD (MSMEG_6761) is the predominantly expressed glpD gene in M. smegmatis. M. tuberculosis possesses two glpD genes (Rv2249c and Rv3302c). Rv2249c shows the highest homology (76% sequence identity) to GlpD3, while Rv3302c exhibits the highest homology (85%) to GlpD2. In addition to three glpD genes, M. smegmatis possesses two gpsA genes (MSMEG_1140 and MSMEG_2393) encoding NADH-linked G3P dehydrogenase (Table S1). The glpF gene encodes a glycerol transporter that is a member of the major intrinsic protein (MIP) family of transporters structurally related to the aquaporins (13). There is a single copy of the glpF gene in M. smegmatis, and the M. tuberculosis genome contains no gene homologous to glpF (33).
To gain further insights into the functional roles of the glpF, glpK, and glpD genes in utilization of glycerol in M. smegmatis, null mutants of M. smegmatis carrying deletions within glpF, glpK, or glpD were constructed, and the mutant strains were assessed for their growth in medium supplemented with 0.2% (wt/vol) glucose or 0.2% (wt/vol) glycerol as the carbon source (Fig. 2). No difference between the wild-type and mutant strains was observed with respect to growth rate, when glucose was supplied as the carbon source. When glycerol was the sole carbon source, growth of the ΔglpK mutant strain and especially of the ΔglpD mutant strain was severely retarded compared to the isogenic wild-type strain, while growth of M. smegmatis was not affected by glpF mutation. Introduction of the plasmid-borne glpD gene (pNBV1::glpD) into the ΔglpD mutant restored growth of the mutant on 7H9-glycerol medium (see Fig. S1 in the supplemental material), indicating that the severe defect in glycerol utilization observed for the mutant was caused by the inactivation of the glpD gene. The observation that the ΔglpK mutant could still grow on glycerol, albeit slowly, indicates the functional redundancy of the GlpK homologs. The dispensability of the glpF gene with respect to glycerol utilization of M. smegmatis and the lack of glpF in M. tuberculosis suggest that glycerol can be transported across the membrane by simple diffusion, as suggested previously (34–36). However, on the basis of a report showing impaired growth of a glpF mutant strain of E. coli at low glycerol concentrations relative to the wild-type strain (37), we do not rule out the possibility that GlpF is beneficial to M. smegmatis under conditions of low glycerol concentrations.
Growth curves of the wild-type, ΔglpF, ΔglpK, and ΔglpD mutant strains of M. smegmatis. The ΔglpF (A), ΔglpK (B), and ΔglpD (C) mutant strains, as well as the wild-type (WT) strain as the control, were grown aerobically in 7H9 medium supplemented with 0.2% (wt/vol) glucose or 0.2% (wt/vol) glycerol as a carbon source at 37°C. All values provided are the averages of the results from three biological replicates. The error bars indicate the standard deviations.
Positive regulation of the glpFKD operon by GylR. The open reading frame designated MSMEG_6757 is located 182 bp upstream of glpF with a transcriptional orientation opposite that of the glpFKD operon (Fig. 1). Its deduced protein product consists of 253 amino acid residues with a calculated molecular mass of 26.7 kDa. Since a BLAST search showed that the gene product of MSMEG_6757 has 48% sequence identity to the reported GylR (glycerol operon regulator) of S. coelicolor, we annotated MSMEG_6757 as gylR.
To examine the functional role of GylR in expression of the glpFKD operon in M. smegmatis, a mutant carrying a deletion within gylR (ΔgylR) was constructed, and expression of the glpFKD operon was examined in the wild-type and ΔgylR mutant strains carrying the glpF::lacZ transcriptional fusion plasmid pNCglpF (Fig. 3). The expression level of glpF was shown to be increased by 1.8-fold in the wild-type strain of M. smegmatis grown on glycerol relative to that in the same strain grown on glucose, indicating the induction of the glpFKD operon in the presence of glycerol. Expression of glpF was almost abolished in the ΔgylR mutant grown on either glucose or glycerol. Introduction of the intact gylR gene into the ΔgylR mutant in trans led to restoration of the regulation pattern and expression levels of glpF observed in the wild-type strain.
Expression of the glpFKD operon in the wild-type and ΔgylR strains of M. smegmatis. The glpF promoter activity was determined in the wild-type (WT) and ΔgylR mutant strains carrying both pMV306 and the glpF::lacZ transcriptional fusion plasmid pNCglpF. For complementation of the ΔgylR mutant, pMV306gylR (a pMV306-derived plasmid carrying the intact gylR gene and its own promoter) was introduced into the mutant (ΔgylR + gylR) in place of pMV306. The strains were grown aerobically to an OD600 of 0.45 to 0.5 in 7H9-glycerol or 7H9-glucose. Cell-free crude extracts were used to measure β-galactosidase activity. All values provided are the averages of the results from three biological replicates. The error bars indicate the standard deviations. Statistical significance was determined by two-tailed Student's t test. *, P < 0.01.
Identification of cis-acting regulatory elements upstream of the glpFKD operon. Since we previously confirmed that the transcriptional start points (TSPs) of ahpC and furA1 extrapolated from RNA deep-sequencing data of the whole transcriptome of M. smegmatis were consistent with those determined by conventional analyses such as primer extension and S1 nuclease mapping (32), the TSP of glpF was predicted from the RNA sequencing data (Fig. 1; see also Fig. S2). The predicted TSP (+1) corresponds to the nucleotide “A” that is located 56 bp upstream of glpF. A sequence (GGTGA-N16-GGCTTC) resembling the consensus sequence of mycobacterial SigF promoters (GGWWT-N16-17-GGGTAY) was identified upstream of the TSP (38, 39). An inverted repeat sequence (TATGA-N6-ACACA) similar to the binding motif of cAMP receptor protein (Crp) (TGTGA-N6-TCACA) was found between the TSP and the start codon of glpF (40, 41).
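The resemblance between the sequences identified upstream of glpF and the published consensus motifs can be made concrete by counting the positions at which each half-site falls outside the consensus. The following is a minimal sketch, not part of the original analysis: it assumes the standard IUPAC ambiguity codes (W = A/T, Y = C/T), ignores the spacer regions, and uses the illustrative helper name `mismatches`.

```python
# Standard IUPAC nucleotide codes needed for the consensus sequences in the text
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "Y": "CT", "R": "AG", "K": "GT", "M": "AC", "N": "ACGT",
}


def mismatches(site, consensus):
    """Count positions where `site` is not permitted by the IUPAC `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))


# Half-sites exactly as quoted in the text (spacers N16 and N6 omitted)
sigf_upstream = mismatches("GGTGA", "GGWWT")
sigf_downstream = mismatches("GGCTTC", "GGGTAY")
crp_left = mismatches("TATGA", "TGTGA")
crp_right = mismatches("ACACA", "TCACA")

print("SigF-like promoter mismatches:", sigf_upstream, sigf_downstream)
print("Crp-like site mismatches:", crp_left, crp_right)
```

Under these assumptions, each half of the putative SigF promoter deviates from the consensus at two positions and each half of the putative Crp-binding site at one position, which is in line with the description of both elements as resembling, rather than matching, their consensus motifs.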
In order to define the location of GylR-binding sites in the intergenic region between gylR and glpF, DNase I footprinting analysis was performed with purified GylR and 237-bp 6-carboxytetramethylrhodamine (TAMRA)-labeled DNA fragments containing the intergenic region (Fig. 4A). The GylR protein was heterologously overexpressed in E. coli and purified as the C-terminally His6-tagged form (Fig. S3). Binding of GylR protected DNA from DNase I cleavage at a position between −99 and −66 with respect to the TSP of glpF. This protected region contains two inverted repeat sequences (IR1 and IR2) that are very similar to each other (Fig. 1). Another inverted repeat sequence (IR3) which resembled IR1 and IR2 was identified 8 bp downstream of IR2. The three IR sequences share the consensus sequence (GKTCGRC-N3-GYCGAMC). The presence of GylR led to a significant increase in the intensity of two bands (denoted by asterisks in Fig. 4A). IR3 is located between these GylR-induced hypersensitive sites. The results of DNase I footprinting analysis suggest the presence of three GylR-binding sites in the gylR-glpF intergenic region and the higher binding affinity of GylR for the IR1 and IR2 sites than for the IR3 site.
Binding of GylR to the glpFKD regulatory region. (A) DNase I footprinting analysis of the glpFKD regulatory region bound by GylR. The gylR-glpF intergenic DNA fragments containing the glpF coding strand labeled with TAMRA at their 5′ ends were incubated with increasing amounts (0, 1.5, and 3.0 nmol) of purified GylR and then subjected to DNase I footprinting reactions. The amounts of GylR protein used are given below the lanes. The regions protected by GylR and the GylR-binding sites (IR1, IR2, and IR3) are marked by the black bar and two head-facing arrows on the right, respectively. The asterisks indicate the hypersensitive sites resulting from GylR binding. Lanes G, A, T, and C represent the sequence ladders. (B) EMSA showing the binding of purified GylR to the glpF regulatory region. The mixtures of 237-bp DNA fragments (50 fmol, corresponding to 7.7 ng; specific DNA) containing the regulatory region of the glpFKD operon and 120-bp DNA fragments (50 fmol, corresponding to 3.9 ng; control DNA) without the GylR-binding site were incubated with increasing amounts of purified GylR in the absence and presence of 50 mM G3P. The concentrations of GylR used are given above the lanes. The GylR-DNA reaction mixtures were subjected to native PAGE. After electrophoresis, the gel was stained with SYBR green EMSA gel staining solution.
It has been suggested that G3P is an effector molecule for S. coelicolor GylR (21). To determine whether G3P influences the binding affinity of GylR for the intergenic region between gylR and glpF, electrophoretic mobility shift assays (EMSA) were conducted using the purified GylR protein and 237-bp DNA fragments encompassing the gylR-glpF intergenic region (specific DNA), as well as 120-bp control DNA fragments containing no GylR-binding site. As shown in Fig. 4B, more specific DNA was shifted with increasing concentrations of GylR in both the presence and absence of G3P, while the control DNA was not shifted. The binding affinity of GylR for the specific DNA fragments was reduced in the presence of G3P, indicating that G3P serves as an effector molecule for GylR. We did not observe distinct bands of DNA-GylR complexes in either the presence or absence of G3P, implying that GylR binds weakly to the intergenic region between gylR and glpF.
Since EMSA showed that the presence of G3P decreased the binding affinity of GylR to its target DNA, possible changes in the quaternary structure of purified GylR in the presence of G3P were examined by gel filtration chromatography (Fig. 5). On the basis of the elution volumes in gel filtration chromatography and the theoretical molecular mass of His6-tagged GylR monomer (27.6 kDa), the quaternary structure of native GylR in the presence and absence of G3P was estimated. In G3P-free solution, GylR exists predominantly as a homotetramer with small fractions of homodimer. As the concentration of G3P was increased in solution, the concentrations of homodimer fractions were increased with the gradual reduction in homotetramer fractions, indicating that the binding of G3P to GylR shifts the equilibrium of GylR oligomerization from tetramer to dimer.
Determination of the molecular mass of GylR by gel filtration chromatography. Elution profiles represent the purified GylR protein on Superose 12 (10/300). The elution profiles of purified GylR without treatment of G3P (control) and with treatment of 10 mM, 20 mM, and 50 mM G3P are indicated. β-Amylase (200 kDa), bovine serum albumin (BSA; 66 kDa), and carbonic anhydrase (CA; 29 kDa) were used as standard proteins.
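The assignment of oligomeric states rests on comparing the theoretical masses of GylR oligomers with the masses of the standard proteins. The short sketch below tabulates this comparison using only values stated in the text (27.6 kDa for the His6-tagged monomer; 29, 66, and 200 kDa for the standards). It does not reproduce the elution data, and the helper name `bracketing_standards` is illustrative.

```python
MONOMER_KDA = 27.6  # theoretical mass of His6-tagged GylR monomer (from the text)

# Standard proteins listed in the figure legend, in kDa
STANDARDS = {
    "carbonic anhydrase": 29.0,
    "BSA": 66.0,
    "beta-amylase": 200.0,
}


def oligomer_mass(subunits, monomer_kda=MONOMER_KDA):
    """Theoretical mass of a homo-oligomer with the given number of subunits."""
    return subunits * monomer_kda


def bracketing_standards(mass_kda):
    """Return the nearest standard below and above `mass_kda` (None if absent)."""
    below = [(m, n) for n, m in STANDARDS.items() if m <= mass_kda]
    above = [(m, n) for n, m in STANDARDS.items() if m > mass_kda]
    lower = max(below)[1] if below else None
    upper = min(above)[1] if above else None
    return lower, upper


for label, n in (("dimer", 2), ("tetramer", 4)):
    mass = oligomer_mass(n)
    lower, upper = bracketing_standards(mass)
    print(f"{label}: {mass:.1f} kDa, between {lower} and {upper}")
```

On this basis a homodimer (about 55 kDa) would be expected to elute between carbonic anhydrase and BSA, and a homotetramer (about 110 kDa) between BSA and β-amylase.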
To investigate the roles of IR1, IR2, and IR3 in the regulation of glpF expression, we determined the promoter activity of glpF in M. smegmatis strains grown on glucose or glycerol by using a series of glpF::lacZ transcriptional fusions with either 5′-nested deletions of the glpF upstream region or mutations within IR2 and IR3 (Fig. 6). Consistent with the results presented in Fig. 3, the control strain of M. smegmatis carrying pNCglpF with intact IR1, IR2, and IR3 showed induction of glpF expression in the presence of glycerol. When grown on glucose, the M. smegmatis strain harboring pNCglpFΔIR1 with IR1 deletion showed a slight increase in glpF expression compared to the control strain with pNCglpF grown under the same condition. In contrast, the expression level of glpF in the M. smegmatis strain with pNCglpFΔIR1 was lower than that in the control strain with pNCglpF when both strains were grown on glycerol. The net effect of IR1 deletion was the loss of glpF induction by glycerol. Expression of glpF was abolished in the M. smegmatis strain harboring either pNCglpFΔIR12 with deletions of IR1 and IR2 or pNCglpFΔIR123 with deletions of IR1, IR2, and IR3, regardless of the presence or absence of glycerol. As observed for the M. smegmatis strain with pNCglpFM3, the introduction of two point mutations into IR3 (GTTCGGC-N3-GCCGAAC to GTTATGC-N3-GCCGAAC) abrogated expression of glpF. Induction of glpF by glycerol still occurred in the M. smegmatis strain carrying pNCglpFM2, in which the IR2 sequence was mutated at four nucleotides conserved in all three IRs (GGTCGGC-N3-GTCGACC to GGTATGC-N3-GTATACC), although expression of glpF was drastically reduced (in each mutated half-site, the conserved CG dinucleotide was changed to AT by transversion mutations introduced through site-directed mutagenesis).

Altogether, the results presented in Fig. 6 suggest the following conclusions. (i) Among the three GylR-binding sites, IR3 is most important for glpF expression. The GylR tetramer or dimer bound at IR3 (located between positions −41 and −57 relative to the TSP of glpF) is assumed to promote the binding of RNA polymerase to the promoter. (ii) From the finding that mutations of IR2 alone or deletion of both IR1 and IR2 led to almost complete abolishment of glpF expression, it is evident that binding of the GylR dimer or tetramer to IR3 requires the presence of IR2. Mutations in IR2 might abrogate the cooperative binding of two GylR dimers and the binding of a GylR tetramer to IR2 and IR3. (iii) Abolishment of glycerol-mediated glpF induction by IR1 deletion indicates that, although the presence of IR1 is not essential for glpF expression, IR1 is necessary for the induction of glpF expression by glycerol. In the presence of glycerol in growth medium, deletion of IR1 is expected to weaken the binding of the GylR dimer to the IR3 site as a result of the lack of the cooperative binding of two GylR dimers to IR1 and IR2. In the absence of glycerol, deletion of IR1 might instead increase the probability that the GylR tetramer would bind to the IR2 and IR3 sites, due to the lack of competition between the IR1 and IR2 sites and the IR2 and IR3 sites for binding of the GylR tetramer.
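The effect of the IR2 and IR3 mutations on the GylR-binding consensus (GKTCGRC-N3-GYCGAMC) can be checked directly from the sequences quoted above. The sketch below is an illustration rather than part of the original analysis: it assumes the standard IUPAC ambiguity codes (K = G/T, R = A/G, Y = C/T, M = A/C), omits the N3 spacer, and uses the illustrative helpers `mismatches`, `revcomp`, `consensus_score`, and `palindrome_mismatches`.

```python
# Standard IUPAC nucleotide codes needed for the GylR consensus
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "R": "AG", "Y": "CT", "M": "AC",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
CONSENSUS = ("GKTCGRC", "GYCGAMC")  # left and right half-sites

# Half-sites as quoted in the text; plasmid names identify the mutant fusions
SITES = {
    "IR2 wild type": ("GGTCGGC", "GTCGACC"),
    "IR2 mutant (pNCglpFM2)": ("GGTATGC", "GTATACC"),
    "IR3 wild type": ("GTTCGGC", "GCCGAAC"),
    "IR3 mutant (pNCglpFM3)": ("GTTATGC", "GCCGAAC"),
}


def mismatches(site, consensus):
    """Count positions where `site` is not permitted by the IUPAC `consensus`."""
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def consensus_score(halves):
    """Total mismatches of both half-sites against the GylR consensus."""
    left, right = halves
    return mismatches(left, CONSENSUS[0]) + mismatches(right, CONSENSUS[1])


def palindrome_mismatches(halves):
    """Positions where the right half differs from the left half's reverse complement."""
    left, right = halves
    return sum(a != b for a, b in zip(revcomp(left), right))


for name, halves in SITES.items():
    print(name, consensus_score(halves), palindrome_mismatches(halves))
```

With these inputs, both wild-type sites fit the consensus without mismatch and IR3 is a perfect inverted repeat, whereas the pNCglpFM3 and pNCglpFM2 mutations introduce two and four consensus mismatches, respectively, matching the numbers of mutated nucleotides stated in the text.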
Effect of deletions and mutations in the putative GylR-binding sites (IR1, IR2, and IR3) on expression of the glpFKD operon. The glpF promoter activity was determined using the pNCglpF-derived glpF::lacZ transcriptional fusions containing serial deletions of the glpF upstream region (pNCglpFΔIR1, pNCglpFΔIR12, pNCglpFΔIR123) (A) or mutations (pNCglpFM2, pNCglpFM3) in the GylR-binding sites (B). As a control (Con), pNCglpF was included in the experiment. The schematic diagrams depicting the transcriptional fusions are presented on the right. The inverted repeats of the GylR-binding sites are indicated by the head-facing arrows. Mutations within IR2 and IR3 are indicated by the asterisks. Cells of the wild-type strain of M. smegmatis harboring the transcriptional fusion plasmids were grown aerobically to an OD600 of 0.45 to 0.5 in 7H9-glycerol or 7H9-glucose. Cell-free crude extracts were used to measure β-galactosidase activity. All values provided are the averages of the results from three biological replicates. The error bars indicate the standard deviations. Statistical significance was determined by two-tailed Student's t test. *, P < 0.05.
Negative regulation of the glpFKD operon by Crp. M. smegmatis contains two genes encoding Crp paralogs (MSMEG_0539 and MSMEG_6189). Sequence homology and biochemical analyses showed that MSMEG_6189 is the Crp protein corresponding to that found in M. tuberculosis (42). The presence of a putative Crp-binding site between the putative promoter and the start codon of glpF led us to assume that Crp is involved in the regulation of the glpFKD operon. To test this assumption, we determined the promoter activity of glpF in the wild-type and Δcrp (ΔMSMEG_6189) mutant strains of M. smegmatis using the glpF::lacZ transcriptional fusion plasmid pMV306lacZglpF. As shown in Fig. 7, the expression level of glpF was increased by 2.2-fold in the wild-type strain of M. smegmatis grown on glycerol relative to the same strain grown on glucose, which is consistent with the result presented in Fig. 3. The expression level of glpF was increased by approximately 4-fold in the Δcrp mutant grown on glucose or glycerol compared to that determined in the wild-type strain grown on the same carbon source. Expression of glpF was induced in the Δcrp mutant by the presence of glycerol as observed in the wild-type strain. Taken together, the results in Fig. 7 indicate that glpF is under the negative control of Crp and that Crp is not involved in induction of glpF expression by glycerol.
Expression of the glpFKD operon in the wild-type and Δcrp mutant strains of M. smegmatis. The glpF promoter activity was determined using pMV306lacZglpF. The wild-type (WT) and Δcrp (ΔMSMEG_6189) mutant strains of M. smegmatis were grown aerobically to an OD600 of 0.45 to 0.5 in 7H9-glycerol or 7H9-glucose. Cell-free crude extracts were used to measure β-galactosidase activity. All values provided are the averages of the results from three biological replicates. The error bars indicate the standard deviations. Statistical significance was determined by two-tailed Student's t test. *, P < 0.01.
To determine whether Crp binds to the glpF regulatory region containing the putative Crp-binding site, EMSA was performed using purified His6-tagged Crp (MSMEG_6189) (Fig. 8). The binding of Crp to 237-bp DNA fragments encompassing the glpF regulatory region was proportional to the amount of applied Crp, with higher concentrations of the protein resulting in the presence of larger amounts of retarded protein-DNA complexes and smaller amounts of free DNA. In contrast, the control DNA fragments without the Crp-binding motif were not shifted with increasing concentrations of Crp.
EMSA showing the binding of purified Crp to the glpFKD regulatory region. Incubations of 237-bp DNA fragments containing the regulatory region of the glpFKD operon (50 fmol, corresponding to 7.7 ng; specific DNA) and 120-bp DNA fragments without the Crp-binding site (50 fmol, corresponding to 3.9 ng; control DNA) were performed with various concentrations of purified Crp (MSMEG_6189). The concentrations of Crp are given above the lanes. The Crp-DNA reaction mixtures were subjected to native PAGE, and the gel was stained with SYBR green EMSA gel staining solution.
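The DNA quantities quoted in the EMSA legends (50 fmol corresponding to 7.7 ng of the 237-bp fragment and 3.9 ng of the 120-bp fragment) follow from a simple molar conversion. The sketch below reproduces that arithmetic under the common approximation of 650 g/mol per base pair of double-stranded DNA. This average is an assumption and is not stated in the text, and the function name `dsdna_mass_ng` is illustrative.

```python
def dsdna_mass_ng(length_bp, amount_fmol, g_per_mol_per_bp=650.0):
    """Approximate mass (ng) of a double-stranded DNA fragment.

    Assumes an average of 650 g/mol per base pair (a rule-of-thumb value,
    not taken from the source text).
    """
    moles = amount_fmol * 1e-15                       # fmol -> mol
    grams = moles * length_bp * g_per_mol_per_bp      # mol -> g
    return grams * 1e9                                # g -> ng


specific_ng = dsdna_mass_ng(237, 50)   # fragment with the glpFKD regulatory region
control_ng = dsdna_mass_ng(120, 50)    # control fragment without binding sites

print(f"237-bp fragment: {specific_ng:.1f} ng")
print(f"120-bp fragment: {control_ng:.1f} ng")
```

Under this approximation the calculated values round to the 7.7 ng and 3.9 ng given in the legends, so equimolar amounts of specific and control DNA were present in each reaction.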
The glpFKD operon belongs to the SigF regulon. Identification of the putative promoter, which resembles the consensus sequence of SigF-recognizing promoters, at a position upstream of glpF prompted us to examine whether transcription of the glpFKD operon depends on SigF. The expression levels of glpF were compared in the wild-type and sigF deletion mutant (ΔsigF) strains of M. smegmatis using the glpF::lacZ transcriptional fusion plasmid pNCglpF. When both strains were grown on glucose, the promoter activity of glpF was significantly lower in the ΔsigF mutant strain than in the wild-type strain (Fig. 9A), indicating that the glpFKD operon belongs to the SigF regulon. Our comparative RNA sequencing analysis of the wild-type strain of M. smegmatis and its isogenic Δaa3 mutant strain with a deletion in ctaC encoding one of the aa3-cytochrome c oxidase subunits revealed that many genes belonging to the SigF regulon are strongly upregulated when the major terminal oxidase of the ETC in M. smegmatis is inactivated (S. Y. Song and J. I. Oh, unpublished data). Consistent with this finding, the promoter activity of glpF was shown to be 9.4-fold higher in the Δaa3 mutant strain than in the wild-type strain (Fig. 9B).
Expression of the glpFKD operon in the ΔsigF and Δaa3 mutant strains of M. smegmatis. The glpF promoter activity was determined in the ΔsigF (A) and Δaa3 (B) mutant strains of M. smegmatis using pNCglpF. The wild-type (WT) strain with pNCglpF was included in the assay as a reference for comparison. The WT and mutant strains were grown aerobically to an OD600 of 0.45 to 0.5 in 7H9-glucose medium. Cell-free crude extracts were used to measure β-galactosidase activity. All values provided are the averages of the results from three biological replicates. The error bars indicate the standard deviations. Statistical significance was determined by two-tailed Student's t test. *, P < 0.01.
DISCUSSION
Given its large genome (6.84 Mbp), the presence of multiple paralogous genes is not unusual in M. smegmatis (32, 43). M. smegmatis has three copies each of the glpK and glpD genes that are involved in glycerol catabolism. We found that when glycerol was supplied as the sole carbon source, the ΔglpD (MSMEG_6761) mutant of M. smegmatis exhibited a severe defect in growth (Fig. 2). This finding indicates that glycerol is catabolized through the GlpK-GlpD metabolic pathway in M. smegmatis and that the MSMEG_6761 product is the major G3P dehydrogenase responsible for glycerol catabolism.
Expression of the glpFKD operon in M. smegmatis was shown to be induced by glycerol, albeit not strongly. The mechanism of induction of glycerol catabolic genes by glycerol has been studied in many bacteria. For example, substrate induction of glycerol catabolic genes in E. coli and B. subtilis is mediated by the GlpR repressor and the GlpP antiterminator, respectively, for which G3P serves as the inducer (17, 18, 22–27). In S. coelicolor, it was demonstrated that GylR negatively regulates the gylCABX (glpFKDX) operon and that G3P also serves as the inducer molecule for GylR (21). Although GylR of M. smegmatis shows 48% identity to GylR of S. coelicolor, it serves as an activator for expression of the glpFKD operon in M. smegmatis, in contrast to S. coelicolor GylR, which functions as a repressor (21). The GylR regulators belonging to the IclR family of transcriptional regulators do not show sequence similarity to GlpR of E. coli, P. aeruginosa, and R. leguminosarum (15, 44, 45). Regulators of the IclR family have the helix-turn-helix motif and the effector-binding domain at their N-terminal and C-terminal domains, respectively, and typically consist of 240 to 280 residues (46). Considering the palindrome structure (GKTCGRC-N3-GYCGAMC) of the identified GylR-binding sites, the DNA-binding unit of GylR for a single palindrome sequence is presumed to be the homodimer as in other members of the IclR family (46). Purified GylR exists in the form of homotetramers in G3P-free solution and is transformed to homodimers in the presence of G3P (Fig. 5). The effector-dependent conversion of the quaternary structure between homodimer and homotetramer is a common property for the IclR family regulators (46). Given that the GylR tetramer simultaneously binds to the two neighboring binding sites, the DNase I footprinting result implies that the binding affinity of the GylR tetramer for the IR1 and IR2 sites is higher than for the IR2 and IR3 sites (Fig. 4A). 
The centers of IR1 and IR2 are separated by two helical turns (22 bp), whereas the centers of IR2 and IR3 are separated by 25 bp. IR1 and IR2 therefore lie on the same face of the DNA helix, which might account for the higher binding affinity of the GylR tetramer for the IR1 and IR2 sites.
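The phasing argument can be checked with simple arithmetic. The sketch below assumes a helical repeat of 10.5 bp per turn for B-form DNA, a textbook value that is not stated in the text; the function name is illustrative only.

```python
BP_PER_TURN = 10.5  # ASSUMPTION: helical repeat of B-form DNA, not given in the text


def phase_offset_degrees(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Angular offset (0 to 180 degrees) between two sites whose centers
    are spacing_bp apart along the helix."""
    angle = (spacing_bp / bp_per_turn * 360.0) % 360.0
    return min(angle, 360.0 - angle)


# Center-to-center distances taken from the text
for label, spacing in (("IR1-IR2", 22), ("IR2-IR3", 25)):
    turns = spacing / BP_PER_TURN
    offset = phase_offset_degrees(spacing)
    print(f"{label}: {spacing} bp = {turns:.2f} turns, offset {offset:.0f} degrees")
```

Under this assumption, the 22-bp spacing leaves the two sites about 34 degrees apart around the helix axis (nearly the same face), whereas the 25-bp spacing places them about 137 degrees apart.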
Based on the importance of IR3 in glpF expression and G3P-mediated changes in the GylR quaternary structure (Fig. 5 and 6), we propose a model explaining the regulation of the glpFKD operon by GylR (Fig. 10). This model is extrapolated from the assumption that the probability of IR3 occupation by GylR is higher in the presence of G3P than in the absence of G3P. The cellular level of G3P in the absence of glycerol is very low; thus, GylR proteins exist predominantly as homotetramers. A GylR tetramer binds to either the IR1 and IR2 sites or the IR2 and IR3 sites. Binding of the GylR tetramer to the IR2 and IR3 sites leads to the activation of glpF expression by recruiting RNA polymerase to the glpF promoter, while binding of the GylR tetramer to the IR1 and IR2 sites does not result in glpF expression. Considering the higher binding affinity of the GylR tetramer for IR1 and IR2 than for IR2 and IR3, the outcome is low expression of glpF in the absence of glycerol. In contrast, the presence of glycerol results in an increase in the cellular level of G3P, and GylR proteins exist as homodimers under these conditions. GylR dimers bind to the IR1, IR2, and IR3 sites in a cooperative way. The IR1 and IR2 sites separated by two helical turns appear to serve as anchoring sites for the cooperative binding of two GylR dimers, and binding of two GylR dimers to IR1 and IR2 further makes it easier for another GylR dimer to bind to IR3 cooperatively. Although the binding affinity of GylR dimers for the gylR-glpF intergenic region is lower than that of GylR tetramers (Fig. 4B), the probability of IR3 occupation by the GylR dimer is expected to be higher than that by the GylR tetramer according to this model. The occupation of IR3 by the GylR dimer recruits RNA polymerase to the glpF promoter, thereby inducing glpF expression.
FIG 10 Model for the regulation of the glpFKD operon by GylR. The GylR monomers are represented by gray ovals, and the GylR-binding sites (IR1, IR2, and IR3) are marked by the head-facing arrows. The numbers between the two adjacent GylR-binding sites indicate the distances between their centers. G3P is depicted by black circles. The promoter region (P) of the glpFKD operon is boxed. RNAP, RNA polymerase.
There are two functional Crp paralogs (MSMEG_6189 and MSMEG_0539) in M. smegmatis, unlike M. tuberculosis H37Rv, which has a single Crp protein (Rv3676) (42, 43). MSMEG_6189 shares 97% amino acid sequence identity with Rv3676, while MSMEG_0539 has 77% identity with Rv3676. Both Crp proteins were suggested to recognize and bind to the same consensus sequence (TGTGA-N6-TCACA), although they differ in biochemical properties such as binding affinity for cAMP and DNA and cAMP-dependent enhancement of the DNA-binding affinity (42, 43). As judged by the RPKM values of the MSMEG_6189 and MSMEG_0539 genes from RNA sequencing analysis (1,248 ± 48 and 157 ± 7, respectively) (32), MSMEG_6189 appears to be the predominantly expressed Crp in M. smegmatis. The presence of a putative Crp-binding sequence between the start codon and TSP of glpF, the binding of purified MSMEG_6189 to DNA fragments containing the glpF upstream region, and the derepression of glpF expression in the Δcrp (ΔMSMEG_6189) mutant of M. smegmatis relative to the wild-type strain all indicate that the glpFKD operon is under the negative regulation of Crp in M. smegmatis.
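Both binding-site motifs quoted in this paper, TGTGA-N6-TCACA for Crp and GKTCGRC-N3-GYCGAMC for GylR, are written in IUPAC notation, so a putative site can be located by translating the consensus into a regular expression. The following is a minimal sketch of that idea, not the procedure used in this study; the function names and the two test sequences are made up for illustration and are not the glpF upstream sequence.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for these motifs)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate e.g. 'GKTCGRC-N3-GYCGAMC' into a compiled regex.

    Segments are separated by '-'; a segment written N<k> means k arbitrary bases.
    """
    parts = []
    for segment in consensus.upper().split("-"):
        spacer = re.fullmatch(r"N(\d+)", segment)
        if spacer:
            parts.append("[ACGT]{%s}" % spacer.group(1))
        else:
            parts.append("".join(IUPAC[base] for base in segment))
    return re.compile("".join(parts))


def find_sites(consensus, sequence):
    """Return (0-based start, matched text) for every match, overlaps included."""
    core = consensus_to_regex(consensus).pattern
    scanner = re.compile("(?=(%s))" % core)  # lookahead keeps overlapping hits
    return [(m.start(), m.group(1)) for m in scanner.finditer(sequence.upper())]


# HYPOTHETICAL test sequences, constructed to contain one site each
gylr_hits = find_sites("GKTCGRC-N3-GYCGAMC", "AAGGTCGACTTTGCCGAACTT")
crp_hits = find_sites("TGTGA-N6-TCACA", "CCTGTGAACGTTGTCACAGG")
print(gylr_hits)
print(crp_hits)
```

Because both consensus sequences are palindromic (each half is the reverse complement of the other), scanning one strand is sufficient for exact matches. Real sites often deviate from the consensus at one or more positions, so an exact-match scan of this kind would miss them; a mismatch-tolerant search or a position weight matrix would be needed in that case.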
The SigF sigma factor of M. smegmatis is a structural and functional homolog of the well-studied SigB sigma factors of Bacillus species (47). The functionality of SigF was previously suggested to be regulated posttranslationally by the so-called partner switching mechanism involving its cognate anti-sigma factor (RsbW in M. smegmatis and UsfX in M. tuberculosis) and anti-anti-sigma factors (RsfA and RsfB) in a manner similar to that seen with SigB (48–50). In B. subtilis exposed to energy-limiting conditions, genes of the SigB regulon were reported to be strongly upregulated (48). Likewise, we observed that the level of expression of the SigF regulon was significantly increased in M. smegmatis exposed to starvation conditions (7H9 medium supplemented with 0.02% [wt/vol] glucose) and in the Δaa3 mutant of M. smegmatis (Song and Oh, unpublished). The Δaa3 mutant showed both an approximately 50% decrease in aerobic respiration in comparison to the wild-type strain and slower growth (51), implying that the mutant undergoes energy limitation. The presence of glycerol induced glpF expression in the wild-type strain by merely about 2-fold, whereas expression of glpF was increased in the Δaa3 mutant by 9.3-fold compared to that in the wild-type strain, implying that the major factor determining the expression level of the glpFKD operon is not the availability of glycerol but the cellular energy state, which is affected by the availability of energy sources and the functionality of the respiratory ETC. Under nutrient-limiting or respiration-inhibitory conditions, the glpFKD operon in M. smegmatis is expected to be highly induced, which might facilitate the utilization of glycerophospholipids and triacylglycerol as carbon and energy sources. If the cellular level of cAMP, which is produced from ATP by adenylyl cyclases, is decreased under these energy-limiting conditions, relief of Crp-mediated repression of the glpFKD operon might further increase expression of the operon.
In conclusion, we found that glycerol is catabolized via the GlpK-GlpD pathway in M. smegmatis. The glpFKD operon is under the tripartite control of GylR, SigF, and Crp, which implies that the availability of glycerol, the cellular energy state, and the cellular levels of cAMP are integrated to exquisitely control expression of the glpFKD operon in M. smegmatis.
MATERIALS AND METHODS
Bacterial strains, plasmids, and culture conditions. The bacterial strains and plasmids used in this study are listed in Table S2 in the supplemental material. M. smegmatis strains were grown at 37°C in Middlebrook 7H9 medium (Difco, Sparks, MD) supplemented with 0.2% (wt/vol) glucose or 0.2% (wt/vol) glycerol as a carbon source and 0.02% (vol/vol) Tween 80 as an anticlumping agent. M. smegmatis strains were grown aerobically in a 500-ml flask filled with 100 ml of 7H9-glucose medium on a gyratory shaker (200 rpm). E. coli strains were grown in Luria-Bertani (LB) medium at 37°C. Ampicillin (100 μg/ml for E. coli), kanamycin (50 μg/ml for E. coli and 30 μg/ml for M. smegmatis), and hygromycin (200 μg/ml for E. coli and 50 μg/ml for M. smegmatis) were added to the growth medium when required. The construction of the mutant strains of M. smegmatis is described in the supplemental material. Validation of the mutant strains by PCR analysis is shown in Fig. S4.
DNA manipulation and electroporation. Standard protocols and manufacturers’ instructions were followed for recombinant DNA manipulations (52). The introduction of plasmids into M. smegmatis strains was conducted by electroporation as previously described (53). The primers used for PCR and site-directed mutagenesis are listed in Table S3.
β-Galactosidase assay and determination of the protein concentration. β-Galactosidase activity was assayed spectrophotometrically as described previously (54). Protein concentrations were determined by using a Bio-Rad protein assay kit (Bio-Rad, Hercules, CA) with bovine serum albumin as the standard protein.
Purification of the GylR and Crp proteins. E. coli strain BL21-CodonPlus(DE3)-RP carrying pET29bgylR was grown aerobically at 37°C in LB medium containing 50 μg/ml kanamycin to an optical density at 600 nm (OD600) of 0.5 to 0.6. Expression of the gylR gene was induced by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.5 mM, and the cells were grown for a further 4 h at 30°C. After a 350-ml culture was harvested, cells were resuspended in 10 ml of buffer A (20 mM Tris-HCl [pH 8.0] containing 100 mM NaCl) containing 10 U/ml DNase I and 10 mM MgCl2. The resuspended cells were disrupted twice using a French pressure cell, and cell-free crude extracts were obtained by two rounds of centrifugation at 14,000 × g for 15 min. After addition of imidazole to a final concentration of 5 mM, 0.6 ml of 50% (vol/vol) Ni-Sepharose high-performance resin (GE Healthcare, Piscataway, NJ) was added to the crude extracts. The protein-resin mixture was loaded into a column, and the column was washed with 40 bed volumes of buffer A containing 10 mM imidazole, followed by 60 bed volumes of buffer A containing 70 mM imidazole. His6-tagged GylR was eluted from the resin with 10 bed volumes of buffer A containing 250 mM imidazole. Imidazole and NaCl were removed from purified GylR by means of a PD-10 desalting column (GE Healthcare) equilibrated with 20 mM Tris-HCl (pH 8.0) or 20 mM sodium phosphate (pH 7.4). Crp (MSMEG_6189) was purified from E. coli strain BL21(DE3) carrying pT7-7crp in the same manner as GylR.
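The wash and elution volumes implied by the "bed volume" figures above follow from the amount of resin added. The sketch below works them out and also shows a generic stock-dilution calculation (C1·V1 = C2·V2). It is a worked illustration only: the 1 M IPTG stock concentration and the assumption that the full 350 ml of culture was induced are hypothetical, since neither is stated in the text, and the small volume contributed by the stock itself is ignored.

```python
def stock_volume_ml(final_conc, final_volume_ml, stock_conc):
    """Solve C1*V1 = C2*V2 for the stock volume V1.
    Concentrations must share a unit; the added stock volume is neglected."""
    return final_conc * final_volume_ml / stock_conc


# Resin: 0.6 ml of a 50% (vol/vol) slurry, as stated in the text
slurry_ml = 0.6
slurry_fraction = 0.5
bed_ml = slurry_ml * slurry_fraction  # settled resin volume

# Number of bed volumes per step, as stated in the text
steps = {
    "wash, 10 mM imidazole": 40,
    "wash, 70 mM imidazole": 60,
    "elution, 250 mM imidazole": 10,
}
step_volumes_ml = {name: n * bed_ml for name, n in steps.items()}
for name, volume in step_volumes_ml.items():
    print(f"{name}: {volume:.0f} ml")

# HYPOTHETICAL: 1 M (1000 mM) IPTG stock added to a 350-ml culture
iptg_ml = stock_volume_ml(final_conc=0.5, final_volume_ml=350, stock_conc=1000)
print(f"IPTG stock needed: {iptg_ml * 1000:.0f} ul")
```

With a 0.3-ml bed, the three steps correspond to 12 ml, 18 ml, and 3 ml of buffer, respectively.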
Gel filtration chromatography using fast-performance liquid chromatography (FPLC). The quaternary structure of purified GylR was determined by gel filtration chromatography using an ÄKTA FPLC system (GE Healthcare) at a flow rate of 0.5 ml/min at 4°C. After the column (Superose 12 10/300 GL; GE Healthcare) was equilibrated with 20 mM sodium phosphate (pH 7.4) buffer, 5 nmol of the purified protein was subjected to chromatography. The molecular mass of GylR was extrapolated from the standard curve generated using proteins of known molecular masses (β-amylase [200 kDa], bovine serum albumin [66 kDa], and carbonic anhydrase [29 kDa]; Sigma, St. Louis, MO). When required, GylR was preincubated for 30 min at 25°C with G3P (Sigma) to reach the final concentrations of 10, 20, and 50 mM.
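A standard curve of this kind is commonly built by regressing log10(molecular mass) on elution volume and reading unknown masses off the fitted line. The sketch below illustrates that calculation with the three standards named above. The elution volumes are HYPOTHETICAL placeholders, because the measured values are not reported in this section, and the exact fitting procedure used in the study is not described; no GylR result is implied.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Standards from the text: beta-amylase, bovine serum albumin, carbonic anhydrase
standards_kda = [200.0, 66.0, 29.0]
# HYPOTHETICAL elution volumes (ml), for illustration only
elution_ml = [11.0, 12.9, 14.3]

slope, intercept = fit_line(elution_ml, [math.log10(m) for m in standards_kda])


def estimate_mass_kda(elution_volume_ml):
    """Apparent molecular mass read off the fitted log-linear standard curve."""
    return 10 ** (slope * elution_volume_ml + intercept)


print(f"log10(mass) = {slope:.3f} * Ve + {intercept:.3f}")
```

A measured GylR elution volume, with or without G3P preincubation, would be passed to `estimate_mass_kda`, and the apparent mass divided by the monomer mass would suggest the oligomeric state. Estimates that fall outside the range spanned by the standards are extrapolations and should be treated with corresponding caution.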
Electrophoretic mobility shift assay (EMSA). EMSA was carried out using an electrophoretic mobility shift assay kit (Invitrogen, Carlsbad, CA) according to the manufacturer’s instructions. A 237-bp DNA fragment encompassing the upstream region of glpF and a 120-bp control DNA fragment without the GylR- and Crp-binding sites were used in EMSA. The 237-bp DNA fragment was generated by PCR using pUC19glpEMSA_F as a template and the primers F_glpEMSA and R_glpEMSA. The 120-bp control DNA fragment was amplified by PCR using chromosomal DNA of M. smegmatis mc2155 as a template and the primers F_ahpC_EMSA and R_ahpC_EMSA. Purified GylR or Crp protein was incubated with 50 fmol of the DNA fragments containing the glpF upstream region and 50 fmol of the control DNA fragments in 20 mM Tris-HCl (pH 8.0) buffer in a reaction volume of 10 μl for 20 min at 25°C. To examine the effect of G3P on the binding of GylR to the DNA fragments, the GylR protein was preincubated for 30 min at 25°C with 50 mM G3P. After the addition of 2 μl of 6× loading buffer (included in the kit), the samples were subjected to nondenaturing PAGE (6% [wt/vol] acrylamide) in 0.5× TBE buffer (41.5 mM Tris-borate, 0.5 mM EDTA, pH 8.3) at 70 V for 3 h at 4°C. The gels were stained with SYBR green staining solution (Invitrogen) for 1 h.
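For readers who think in concentrations rather than amounts, the probe quantities above convert as follows. This is a worked illustration; the average mass of 650 g/mol per base pair is a common approximation for double-stranded DNA and is an assumption, not a value from the text.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # ASSUMPTION: approximate mass of one dsDNA base pair


def molar_conc_nM(amount_fmol, volume_ul):
    """fmol per microliter is numerically equal to nmol per liter (nM)."""
    return amount_fmol / volume_ul


def dsdna_mass_ng(amount_fmol, length_bp):
    """Approximate mass of a dsDNA fragment: mol * g/mol, expressed in ng."""
    grams = amount_fmol * 1e-15 * length_bp * AVG_BP_MASS_G_PER_MOL
    return grams * 1e9


probe_nM = molar_conc_nM(50, 10)         # 50 fmol in a 10-ul reaction
probe_ng = dsdna_mass_ng(50, 237)        # 237-bp glpF upstream fragment
control_ng = dsdna_mass_ng(50, 120)      # 120-bp control fragment
loading_x = 6 * 2 / (10 + 2)             # 2 ul of 6x buffer into 10 ul

print(f"each fragment: {probe_nM:.1f} nM")
print(f"237-bp probe: ~{probe_ng:.1f} ng; 120-bp control: ~{control_ng:.1f} ng")
print(f"loading buffer after mixing: {loading_x:.0f}x")
```

Each fragment is therefore present at 5 nM in the binding reaction, and the loading buffer ends up at 1× once mixed with the sample.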
DNase I footprinting analysis. DNase I footprinting was carried out using fluorescence (TAMRA)-labeled DNA fragments and purified GylR protein. TAMRA-labeled DNA fragments (237 bp) containing the gylR-glpF intergenic region were generated by PCR using primers F_TAMRA_pUC19 and R_glpEMSA, with pUC19glpEMSA_F as the template. The PCR products were purified after agarose gel electrophoresis, and then DNA concentrations were determined using a NanoDrop 2000 spectrophotometer (Thermo). DNA binding reaction mixtures were composed of 5 pmol of labeled DNA probes, purified GylR (1.5 or 3.0 nmol), 20 mM Tris-HCl (pH 8.0), 0.6 mM MgCl2, 5.6 mM KCl, 0.1 mM dithiothreitol (DTT), and 11.1% (vol/vol) glycerol in a final volume of 190 μl. The mixtures were then incubated for 10 min at 25°C. DNase I (TaKaRa, Tokyo, Japan) was diluted in buffer containing 20 mM Tris-HCl (pH 8.0), 1 mM MgCl2, 50 mM NaCl, 1 mM DTT, and 10% (vol/vol) glycerol to reach a final concentration of 8 mU/μl. DNase I digestion was initiated with the addition of 10 μl of diluted DNase I to the binding reaction mixtures, conducted for 30 s at 25°C, and stopped with the addition of 400 μl of stop solution containing 40 mM EDTA–20 mM Tris-HCl (pH 8.0). DNA was purified by phenol-chloroform/isoamyl alcohol (25:24:1) extraction and isopropyl alcohol precipitation. The pellets were dissolved in loading buffer (5:1 [vol/vol] mixture of deionized formamide and 25 mM EDTA [pH 8.0] with 50 mg/ml blue dextran) and analyzed by electrophoresis on 6% (wt/vol) denaturing polyacrylamide gels with 7 M urea in 0.8× Tris-taurine-EDTA (TTE) buffer using an ABI PRISM 377 DNA sequencer (Applied Biosystems, Foster City, CA). Reference sequencing was performed by using a Thermo sequenase dye primer manual cycle sequencing kit (USB, Cleveland, OH) with the primer F_TAMRA_pUC19 and the template plasmid pUC19glpEMSA_F.
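The quantities in the footprinting protocol can be cross-checked with a few lines of arithmetic. The sketch below derives the protein-to-probe molar ratio, the total DNase I added, and the Mg2+ and EDTA concentrations before and after the stop solution is added. All inputs are taken from the protocol above; the helper name is illustrative, and volumes are assumed to be additive.

```python
def mixed_conc(components, total_volume_ul):
    """Concentration after mixing; components = [(conc, volume_ul), ...]."""
    return sum(conc * vol for conc, vol in components) / total_volume_ul


binding_ul, dnase_ul, stop_ul = 190.0, 10.0, 400.0
digest_ul = binding_ul + dnase_ul          # volume during digestion
final_ul = digest_ul + stop_ul             # volume after stop solution

# Molar excess of GylR over the 5 pmol of probe (1 nmol = 1000 pmol)
ratios = [nmol * 1000.0 / 5.0 for nmol in (1.5, 3.0)]

dnase_mU = 8.0 * dnase_ul                  # 8 mU/ul * 10 ul

# Mg2+ during digestion: 0.6 mM from the binding mix plus 1 mM from the DNase diluent
mg_digest_mM = mixed_conc([(0.6, binding_ul), (1.0, dnase_ul)], digest_ul)
# After stopping: Mg2+ is diluted further, EDTA arrives at 40 mM in 400 ul
mg_final_mM = mg_digest_mM * digest_ul / final_ul
edta_final_mM = mixed_conc([(40.0, stop_ul)], final_ul)

print(f"GylR:probe molar ratios: {ratios[0]:.0f}:1 and {ratios[1]:.0f}:1")
print(f"DNase I per reaction: {dnase_mU:.0f} mU in {digest_ul:.0f} ul")
print(f"Mg2+ during digestion: {mg_digest_mM:.2f} mM")
print(f"after stop: EDTA {edta_final_mM:.1f} mM vs Mg2+ {mg_final_mM:.2f} mM")
```

The calculation shows that EDTA is in more than 100-fold molar excess over Mg2+ once the stop solution is added, which is consistent with its role in halting the Mg2+-dependent DNase I digestion.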
ACKNOWLEDGMENTS
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2017R1A2B4008404).
FOOTNOTES
- Received 1 August 2019.
- Accepted 26 September 2019.
- Accepted manuscript posted online 30 September 2019.
Supplemental material for this article may be found at https://doi.org/10.1128/JB.00511-19.
- Copyright © 2019 American Society for Microbiology.