ABSTRACT
Several experimental approaches were used to construct a detailed transcriptional profile of the phylogenetically conserved ftsZ cell division gene cluster in both Mycoplasma genitalium and its closest relative, Mycoplasma pneumoniae. We determined initiation and termination points for the cluster, as well as an absolute steady-state RNA level for each gene. Transcription of this cluster in both these organisms was shown to be highly strand specific. While the four genes in this cluster are cotranscribed, their transcription unit also includes two genes of close proximity yet disparate function. A transcription initiation point immediately upstream of these two genes was detected in M. genitalium but not M. pneumoniae. In M. pneumoniae, transcription of the six genes terminates at a poly(U)-tailed hairpin. In M. genitalium, this transcription terminates at two closely spaced points by an unknown mechanism. Real-time reverse transcription-PCR analysis of this cluster in M. pneumoniae shows that mRNA levels for all six genes vary at most fivefold and form a gradient of decreasing quantity with increasing distance from the promoter at the beginning of the cluster. mRNA from coding regions was approximately 20- to 100-fold more abundant than that from intergenic regions. We estimated the most abundant mRNA we detected at 0.6 copy per cell. We conclude that groups of functionally related genes in M. genitalium and M. pneumoniae are often preceded by promoters but rarely followed by terminators. This causes functionally unrelated genes to be commonly cotranscribed in these organisms.
The current closest approximation to a “minimal genome” found in nature is Mycoplasma genitalium, whose chromosome is a mere 580 kb (11). Although parasitic, this mollicute can be grown in pure culture and therefore has the smallest known genome of any organism that can replicate autonomously. M. genitalium's closest relative is Mycoplasma pneumoniae, which contains the same gene set but with about 200 additional genes (12). Almost one-third of the roughly 300 genes estimated to be essential in M. genitalium have no known function (13). Little is known about gene expression in mycoplasmas, but they seem to lack several important regulatory mechanisms present in higher bacteria, including multiple sigma factors and the transcription termination factor Rho (11).
Transcriptional analyses of gene clusters can help elucidate gene function and transcriptional mechanisms for control of gene expression. Currently the only detailed transcriptional analyses of gene clusters in M. genitalium and M. pneumoniae involve the study of cytadherence-related genes specific to Mycoplasma (9, 21, 31). We sought to create a detailed transcriptional profile of a well-conserved cell division gene cluster (Fig. 1A). An understanding of these genes and their functions is important in defining the minimal genetic requirements for bacterial cell division.
FIG. 1. (A) To-scale schematic of the ftsZ gene cluster and surrounding genes in M. pneumoniae and M. genitalium. Each gene is identified by its name in each of these species, as well as by the COG to which it belongs (27). Common names are indicated for some genes. Numbered bars show where we attempted to detect transcription across gene junctions with RT-PCR. A solid line represents a positive result, and a dashed line represents a relatively weak result. (B) Agarose gel visualization of RT-PCR products as numbered in panel A. Targets 1 to 6 are strand-specific reactions. Target 7 represents long-range RT-PCR across two gene junctions. Lanes + are positive PCR controls with genomic DNA as template. Lanes F represent use of the forward primer in the reverse transcription reaction, which will detect negative-strand RNA; lanes R represent use of the reverse primer in the reverse transcription reaction, which will detect positive-strand RNA. Lanes N are negative controls for DNA contamination done without the reverse transcriptase enzyme.
This cluster contains ftsZ, which is the most conserved of all known bacterial cell division genes and codes for a homolog of tubulin involved in mechanical invagination of a dividing cell (18). This gene is often clustered with other genes to form the dcw, or division and cell wall, gene cluster (18). In Escherichia coli, this cluster contains 15 genes, while in Bacillus subtilis it is composed of 16 genes. In M. genitalium and M. pneumoniae, which lack a cell wall, this cluster comprises just four genes, with ftsZ at the 3′ end. A cataloguing of nonessential genes in M. genitalium and M. pneumoniae through global transposon mutagenesis did not identify transposon disruptions of any of these four genes (13). This experimental evidence suggests that these genes are essential in these organisms.
All but one gene in the cluster have orthologs in other species as determined by cluster of orthologous groups (COG) analysis (27). The first gene in this cluster belongs to COG2001 and has an unknown function. Its protein product has recently been crystallized from M. pneumoniae and contains a novel fold (4). The second gene belongs to COG0275 and is present in all 145 bacterial species currently catalogued by the STRING database (version 6.0) (30). It is annotated through COG analysis as a predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis; this is supported by experimental evidence (3). Neither of these genes is essential in E. coli (1, 7, 19). MG223/MPN316, the third gene in the cluster, has an unknown function and does not belong to a COG.
Two additional genes of disparate function follow this cluster very closely. Separated from ftsZ by only 45 bp (M. genitalium) or 42 bp (M. pneumoniae) are two amino acid permeases that are both members of the same COG. There may not be room for transcription termination signals within small intergenic spaces. If this is the case, the permeases may be transcriptionally linked to the conserved gene cluster, even though they do not seem to share any functional similarity. Neither the hairpin terminator prediction program TransTerm (10) nor GeSTer (29) predicts a terminator between ftsZ and these genes.
A strongly predicted poly(U)-tailed hairpin terminator flanks the second amino acid permease in M. pneumoniae (as detected by the program TransTerm) (10), but no corresponding hairpin exists in M. genitalium. The intergenic gap following the second of these permeases is very large at 471 bp (M. genitalium) or 494 bp (M. pneumoniae). Analysis of gene spacing within clusters suggests that related genes are unlikely to be separated by more than 300 bp (23), so the 3′ end of this cluster probably lies at this junction. MG221/MPN314 is presumably the first gene in this cluster, since the upstream gene is transcribed from the opposite strand.
We took several approaches to mapping transcription units for this group of six genes, including identifying initiation and termination sites. We also used real-time quantitative reverse transcription-PCR (RT-PCR) to compare relative as well as absolute steady-state mRNA levels among multiple genes and in intergenic regions.
MATERIALS AND METHODS
Cell culture. M. pneumoniae strain M129-B170 and M. genitalium strain G-37 were each cultured in SP-4 medium (28) (except that 10³ U/ml penicillin was used and L-glutamine was added separately) in tissue culture flasks at 37°C. Cultures were harvested when the medium turned orange from red.
Genomic DNA isolation. M. pneumoniae and M. genitalium genomic DNAs were isolated by a standard protocol of cell resuspension in Tris-EDTA, lysis with sodium dodecyl sulfate, RNase A and proteinase K digestions, and phenol-chloroform extraction followed by ethanol precipitation and resuspension.
Cellular RNA isolation. M. pneumoniae and M. genitalium total cellular RNAs were isolated using the RNeasy Mini kit (QIAGEN, bacterial protocol), treated with DNase I (Promega or Ambion), and repurified with the RNeasy Mini kit (cleanup protocol).
Strand-specific RT-PCR. The strand-specific RT-PCR assay combined several approaches used to circumvent endogenous priming and maximize RT-PCR strand specificity (6, 16, 17). Total cellular RNA (0.5 μg) of either M. pneumoniae or M. genitalium was used as a template for reverse transcription with Tth polymerase (Applied Biosystems), according to the manufacturer's protocol. Two reactions were performed separately for each target, using either a forward or reverse primer consisting of an 18- to 23-nucleotide (nt) gene-specific sequence attached to an 18-nt “tag” with no significant homology to either mycoplasma genome (Table 1). All forward primers contained one tag sequence, and all reverse primers contained a different tag sequence. Reactions were hot started by preincubating an RNA and primer mix separately from a mixture of the remaining reaction components for 5 to 8 min at 65°C, combining the two, and then allowing the reaction to proceed for 15 min at 50 to 60°C. Reactions were stopped by the addition of 5.0 μl 1× chelating buffer (Applied Biosystems), digested for 45 min at 37°C with 10 U exonuclease I (U.S. Biochemical Corp.), and heat inactivated for 15 min at 80°C. One microliter of each reaction mixture was used as template in a 25.0-μl PCR with Taq polymerase (QIAGEN) and either the forward tag and a tagged reverse primer (for cDNA made with the tagged forward primer) or the reverse tag and a tagged forward primer (for cDNA made with the tagged reverse primer). Reactions with genomic DNA template were performed as a positive control, using a forward-tagged primer and a reverse-tagged primer.
TABLE 1. Primer sequences used in this study
Long-range RT-PCR. RT-PCR generation of amplicons greater than about 1 kb was performed using the cMaster RTplusPCR system and cMaster RT kit (Eppendorf) following the two-step RT-PCR protocol of the manufacturer. Reaction mixtures contained 0.5 μM of a reverse-tagged primer, about 1.25 μg of either M. pneumoniae or M. genitalium RNA, and 7.5 U of reverse transcription enzyme. They were performed with and without the reverse transcription enzyme as a negative control for DNA contamination and were incubated at 42°C for 90 min. Each cDNA was used as PCR template in a 25.0-μl reaction set up per the kit protocol, using a forward primer and a reverse tag. Reactions with genomic DNA template were performed as a positive control, using a forward-tagged and a reverse-tagged primer. Reactions were amplified by incubation at 94°C for 3 min followed by 35 cycles of 93°C for 15 s, 55°C for 20 s, and 68°C for 5 min, with a final elongation of 10 min.
Primer extension. Primer extension was performed using the Primer Extension System-avian myeloblastosis virus (AMV) reverse transcriptase (Promega) with 10 to 90 μg M. pneumoniae RNA. Primers (UNC-Chapel Hill Nucleic Acid Core Facility) were end labeled with [γ-32P]ATP (Perkin-Elmer) using T4 polynucleotide kinase (Promega). Products were analyzed on an 8% 7 M urea denaturing polyacrylamide gel adjacent to sequencing reactions (T7 Sequenase version 2.0 DNA sequencing kit; U.S. Biochemical Corp.) performed using the corresponding primer extension primer and an M. pneumoniae PCR product that spanned the appropriate region.
RNase protection. RNase protection assays were performed using an RPAIII kit (Ambion), following the manufacturer's protocol, with 10 to 30 μg template M. pneumoniae or M. genitalium RNA. Probes were in vitro transcribed and labeled with [α-32P]UTP at 800 Ci/mmol (Perkin-Elmer) using a Maxiscript kit (Ambion) and gel purified. Probe template consisted of a PCR product amplified from M. pneumoniae or M. genitalium genomic DNA ligated to a T7 RNA polymerase promoter which was then itself PCR amplified. The T7 promoter was made by annealing two oligonucleotides (UNC Nucleic Acids Core Facility) after phosphorylation of the antisense oligonucleotide. In later experiments, the appropriate primer was simply designed to contain the T7 sense oligonucleotide sequence, depending upon the desired polarity of the transcript. Reactions were run adjacent to standards transcribed from the RNA Century Marker Plus template set (Ambion) on 4 to 5% 7 M urea polyacrylamide gels.
Quantitative RT-PCR. Each cellular RNA target amplified by Taqman real-time PCR was quantitated relative to a standard curve generated from a recombinant RNA (for details of the analysis, see the supplemental material). cDNA was generated using 0.5 μg M. pneumoniae total cellular RNA or 10¹⁰ copies of an in vitro-transcribed recombinant RNA as a template in a 20.0-μl reverse transcription reaction with Tth polymerase (Applied Biosystems) and 0.75 μM of a gene-specific primer (Integrated DNA Technologies). Reaction mixtures were incubated at 65°C for 5 min followed by 60°C for 15 min, then diluted 1:5 with diethylpyrocarbonate (DEPC)-treated distilled water. Serial 1:10 dilutions were made of each cDNA reaction mixture, and 5.0 μl was used as a PCR template as follows: with the 10⁻² dilution of the cDNA made from total cellular RNA and with the 10⁻² through 10⁻⁶ dilutions of the cDNA made from recombinant RNA. Real-time PCR was performed with dual-labeled 6-carboxyfluorescein-6-carboxytetramethylrhodamine (FAM-TAMRA) probes at 250 nM (Sigma) and primers (Integrated DNA Technologies) at 900 nM, using Taqman Universal PCR Master mix (Applied Biosystems) on an ABI7000 thermocycler (Applied Biosystems). For each assay, reverse transcription reactions performed without enzyme and PCRs performed without template were included as controls for DNA contamination. Primers and probes were designed using Primer Express 2.0 software (Applied Biosystems). PCR cycling conditions were 1 cycle of 50°C for 2 min, followed by 1 cycle of 95°C for 10 min, followed by 50 cycles of 95°C for 15 s and 60°C for 1 min.
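The dilution scheme above fixes how much of the original reverse transcription input ends up in each PCR. The following is a minimal sketch of that bookkeeping, not the authors' analysis code. It assumes that "diluted 1:5" means the 20.0-μl reaction was brought to 100 μl, a reading that is consistent with the amounts quoted later under "Calculation of absolute numbers of mRNA molecules per cell". The function and variable names are illustrative.

```python
# Dilution bookkeeping for the quantitative RT-PCR template amounts.
# Numeric constants come from the text; names are illustrative only.

RT_VOLUME_UL = 20.0      # volume of each reverse transcription reaction
POST_RT_DILUTION = 5.0   # 1:5 dilution, read here as 20 ul brought to 100 ul
PCR_TEMPLATE_UL = 5.0    # diluted cDNA added to each PCR


def input_per_pcr(rt_input, dilution_exponent):
    """Return the amount of RT input represented in one PCR.

    rt_input: copies of recombinant RNA, or micrograms of cellular RNA,
        placed in the reverse transcription reaction.
    dilution_exponent: exponent of the serial dilution used, e.g. -4.
    Assumes all input RNA is carried through as cDNA.
    """
    per_microliter = rt_input / (RT_VOLUME_UL * POST_RT_DILUTION)
    return per_microliter * PCR_TEMPLATE_UL * 10.0 ** dilution_exponent


# Five-point standard curve from 1e10 copies of recombinant RNA.
standard_copies = {e: input_per_pcr(1e10, e) for e in (-2, -3, -4, -5, -6)}

# Cellular RNA (0.5 ug in the RT) amplified from its 10^-2 dilution.
cellular_rna_ug = input_per_pcr(0.5, -2)

for exponent, copies in standard_copies.items():
    print(f"10^{exponent} dilution: {copies:.1e} copies per PCR")
print(f"cellular RNA per PCR: {cellular_rna_ug:.1e} ug")
```

Under this reading, the standard curve spans five orders of magnitude of template copies, while each cellular sample contributes a single fixed mass of RNA per PCR.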
Quantitative RT-PCR standard preparation. Each quantitative RT-PCR standard contained the same 0.1-kb sequence as a cellular target, but with about 1 kb of surrounding M. pneumoniae genomic sequence. This was done in order to mimic the relatively small targets in their expected cellular context, since it could be more efficient to amplify a small target in its entirety than a small target within a longer sequence. Recombinant RNA templates were made as described for RNase protection assay probes, and recombinant RNAs were in vitro transcribed using a MEGAscript T7 kit (Ambion). Reaction mixtures were treated with DNase (Ambion) and purified with RNeasy Mini columns (QIAGEN). Size and intactness of transcripts were verified by formaldehyde agarose gel electrophoresis. Each recombinant RNA was quantitated by the average of six absorbance readings and divided into single-use aliquots.
Determination of reverse transcription efficiency. To determine the reverse transcription efficiency of Tth under the conditions used to generate cDNA for quantitative PCR, fluorescence signals were compared between the same quantities of a double-stranded plasmid DNA molecule (pTRI-Xef1; Ambion) and a recombinant RNA in vitro transcribed from the same plasmid. Primers (Integrated DNA Technologies) and probe (dual-labeled FAM-VIC; Applied Biosystems) were designed using Primer Express 2.0 software (Applied Biosystems). The following four templates were reverse transcribed with Tth as described above for quantitative RT-PCR: 10¹⁰ copies of recombinant RNA and separately 10⁶, 10⁷, and 10⁸ copies of recombinant RNA with 0.5 μg M. pneumoniae cellular RNA. Each of the resulting cDNA preparations was then serially diluted and PCR amplified to create a three-point standard curve. A commensurate curve was created with plasmid DNA. The assay was repeated three times. The four curves were compared at their median dilution to take the average of any differences in efficiency. We calculated the reverse transcription efficiency from the difference in cycle threshold (CT) between the recombinant RNA and the plasmid DNA, assuming 100% PCR efficiency and factoring in the difference in template amount because the plasmid DNA was double stranded and the recombinant RNA was single stranded. This percent efficiency is $[2^{\mathrm{CT}(\text{plasmid DNA}) - \mathrm{CT}(\text{recombinant RNA})} \times 2] \times 100$.
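The efficiency expression can be written as a short function. This is a sketch under the assumptions stated in the text (perfect doubling per PCR cycle, equal copy numbers of plasmid and RNA); the function names and the CT values in the example are hypothetical and are not measurements from this study.

```python
import math


def rt_efficiency_percent(ct_plasmid_dna, ct_recombinant_rna):
    """Percent RT efficiency from cycle thresholds at equal copy number.

    Assumes perfect doubling per PCR cycle. The factor of 2 compensates for
    the plasmid contributing two template strands per copy, whereas the RNA
    yields at most one cDNA strand per copy.
    """
    return 2.0 ** (ct_plasmid_dna - ct_recombinant_rna) * 2.0 * 100.0


def ct_offset_for_efficiency(percent):
    """Inverse relation: CT(plasmid DNA) - CT(recombinant RNA) for a given efficiency."""
    return math.log2(percent / 200.0)


# Hypothetical CT values: cDNA crossing threshold exactly one cycle after plasmid DNA.
print(rt_efficiency_percent(20.0, 21.0))  # 100.0
```

A one-cycle delay of the cDNA relative to the double-stranded plasmid therefore corresponds to complete conversion of RNA to cDNA, and a smaller delay yields an apparent efficiency above 100%.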
Determination of total RNA per cell. Total cellular RNA and genomic DNA were each isolated separately from the same M. pneumoniae culture with TRI reagent (Molecular Research Center, Inc.) according to the manufacturer's protocol, and each was quantitated by an average of two absorbance readings. Five sets of isolations performed separately on two to three cell pellets each time yielded an average RNA/DNA ratio of 1.9. Ratios for each set were 0.9, 1.7, 1.8, 2.0, and 2.1; the lowest measurement was discarded. The molecular weight of an M. pneumoniae chromosome was calculated at 504,368,278.6, and we assumed two chromosomes per cell.
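The conversion from an RNA/DNA ratio to RNA mass per cell can be sketched as follows. This is an illustrative reconstruction rather than the authors' calculation: it assumes the ratio is a mass ratio (both nucleic acids were quantitated by absorbance) and that the chromosome molecular weight is in grams per mole. The variable names are hypothetical.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
CHROMOSOME_MW = 504_368_278.6     # g/mol, value given in the text
CHROMOSOMES_PER_CELL = 2          # assumption stated in the text

# RNA/DNA ratios from the five isolation sets; the lowest was discarded.
ratios = [0.9, 1.7, 1.8, 2.0, 2.1]
kept = sorted(ratios)[1:]
mean_ratio = sum(kept) / len(kept)

# Mass of genomic DNA per cell, converted from grams to micrograms.
dna_ug_per_cell = CHROMOSOMES_PER_CELL * CHROMOSOME_MW / AVOGADRO * 1e6

# RNA mass per cell follows from the RNA/DNA mass ratio.
rna_ug_per_cell = mean_ratio * dna_ug_per_cell

print(f"mean RNA/DNA ratio: {mean_ratio:.2f}")
print(f"DNA per cell: {dna_ug_per_cell:.2e} ug")
print(f"RNA per cell: {rna_ug_per_cell:.2e} ug")
```

With the four retained ratios, this arithmetic gives an RNA mass per cell that rounds to the 3.2 × 10⁻⁹ μg estimate used in the calculation of absolute mRNA numbers.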
Calculation of absolute numbers of mRNA molecules per cell. The most abundant cellular RNA we quantitated (amplicon 2 in Fig. 4) generated a fluorescence signal very close to the 10⁻⁴ dilution of the standard curve to which it was compared. This dilution corresponds to 5 × 10⁴ template recombinant RNA molecules. This signal was generated from 2.5 × 10⁻⁴ μg RNA. We estimated the amount of M. pneumoniae RNA per cell at 3.2 × 10⁻⁹ μg. Therefore, we detected 5 × 10⁴ cellular RNA molecules in approximately 78,125 cells, which is 0.64 molecule per cell.
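The per-cell figure follows from two divisions, sketched here with the values given in the text. The function name is illustrative, and the calculation inherits the text's assumption of 100% reverse transcription efficiency.

```python
def molecules_per_cell(molecules_in_pcr, rna_in_pcr_ug, rna_per_cell_ug):
    """Convert molecules detected in one PCR into molecules per cell.

    molecules_in_pcr: RNA molecules inferred from the standard curve.
    rna_in_pcr_ug: mass of cellular RNA represented in that PCR.
    rna_per_cell_ug: estimated mass of total RNA in one cell.
    Returns (molecules per cell, number of cell equivalents in the PCR).
    """
    cell_equivalents = rna_in_pcr_ug / rna_per_cell_ug
    return molecules_in_pcr / cell_equivalents, cell_equivalents


per_cell, cells = molecules_per_cell(5e4, 2.5e-4, 3.2e-9)
print(f"cell equivalents per PCR: {cells:,.0f}")
print(f"molecules per cell: {per_cell:.2f}")
```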
RESULTS
All six genes in the cluster are cotranscribed strand specifically. We used RT-PCR to establish whether adjacent genes are cotranscribed. The assay was made strand specific in order to know the polarity of detected RNAs (see Materials and Methods). The processivity of Tth reverse transcriptase limited our regions of analysis to about 1 kb each (Fig. 1A), which were centered across gene junctions. This tested for the presence of a transcript extending about halfway into each open reading frame of adjacent genes. All six junctions tested, from MPN314 to -319, showed transcripts of the expected polarity (Fig. 1B). The same results were obtained for MG221 to -226. In several cases, transcripts of the opposite polarity were also detected, but these quantities were relatively small compared to their complements.
The RT-PCR approach was also extended to encompass longer regions. A more processive reverse transcriptase was able to detect a 3.7- to 3.9-kb transcript extending from ftsZ through MPN319/MG226 (amplicon 7, Fig. 1). The results are weak, but this may be due to amplicon size rather than transcript abundance.
A promoter lies immediately upstream of the two amino acid permeases in M. genitalium but was not detected in M. pneumoniae. To expand upon our RT-PCR approach, primer extension (Fig. 2) and RNase protection (Fig. 3) were also used to examine transcripts in this region. RNase protection identified transcription initiation just upstream of MPN314, and primer extension mapped two transcription start sites in this region. RNase protection also mapped transcription initiation just 5′ to the orthologous gene MG221. It failed to detect transcription initiation immediately preceding each of the genes MPN315 to -319. Several RNase protection probes were used in the regions spanning the MPN317 to -318 and MPN318 to -319 junctions; all of these failed to map transcript termini. Similarly, RNase protection did not detect transcription initiation points immediately preceding MG222, MG223, MG224, or MG226.
FIG. 2. Primer extension determination of the first transcribed nucleotides for MPN314. Autoradiographic results shown were obtained with primer 1; the use of primer 2 produced the same results but only with significantly more template RNA, probably due to primer-specific factors. The same oligonucleotide was used in both the sequencing (lanes labeled G, A, T, and C) and the primer extension (lane PE) reactions to generate the sequence complementary to the mRNA. For primer extension, the primer was 5′ phosphorylated to include a radiolabel, causing its mobility to shift about 1 nt downward relative to the sequencing products. The two results are shown mapped in large capital letters onto the relevant sequence. The predicted −35 and −10 promoter regions and +1 nt (marked with gray boxes) are as determined by the mycoplasma promoter matrix (34). The coding region for MPN314 is italicized. The positions of the two primers are underlined.
FIG. 3. (A) To-scale schematic showing regions of RNase protection analysis of and surrounding the ftsZ gene clusters of M. genitalium and M. pneumoniae. Probe positions are shown by numbered arrows, where direction indicates polarity. Mapped transcription initiation points are marked by short arrows; mapped transcription termination points are marked by diamonds. RT-PCR targets are marked for reference as in Fig. 1. (B) Autoradiographic results are shown only for those probes that mapped transcript termini; full protection of the probe was observed for many junctions (see the supplemental material). Genomic coordinates are listed for each probe as obtained from GenBank sequences NC_000912 (M. pneumoniae) and NC_000908 (M. genitalium). Three reactions are shown with each probe. As numbered, they are as follows: 1, probe hybridized with mycoplasma RNA and digested with an RNase A/T1 mix (except for MG probe 8, which was digested with RNase I); 2, probe hybridized with a corresponding amount of yeast RNA and digested with an RNase A/T1 mix (RNase I for MG probe 8); and 3, probe hybridized with the same amount of yeast RNA but incubated without RNases. The ladder is composed of 100-, 200-, 300-, 400-, 500-, 750-, and 1,000-nt fragments (RNA Century Marker Plus template set; Ambion). The size of the probe, including 12 to 13 nt of T7 promoter sequence, is labeled. The size of each digested species is also labeled. This was determined by using ImageJ software (National Institutes of Health) to mark a coordinate for each of the ladder fragments and the resulting fragments and then using DNASIZE (25) to calculate the size of the unknown fragment. Accuracy of this assay was determined to be within 30 nt (from −20 to +10 nt) of the calculated result.
However, RNase protection did detect transcription initiation immediately upstream of MG225 (probe 15, Fig. 3), and a doublet band of protected probe indicates a heterogeneous transcription initiation site, as for MPN314. RNase protection analysis of the junction between the two amino acid permeases supported the notion that these genes are cotranscribed in both species. It established that transcription terminates downstream of the second amino acid permease (MPN319/MG226).
RNase protection also roughly established the transcription start site for MPN320, which is downstream of the annotated translation start. However, it is upstream of the annotated translation start for the orthologous gene in M. genitalium, which was confirmed by RNase protection.
Transcript levels for genes in the M. pneumoniae cluster vary at most fivefold, and form a gradient of decreasing quantity with increasing distance from the promoter. To complete our profile of the ftsZ cluster, we determined relative transcript amounts of MPN314 to -319 with Taqman real-time RT-PCR (Fig. 4). We quantitated the level of mRNA at one spot within the coding region of each gene, about 200 nt downstream of the gene's start codon. This was done as a control for the possibility that each gene is transcribed separately. If this is the case, transcript levels should be measured at the same distance from each promoter in order to minimize the possible effects of RNA polymerase processivity on our results. We also measured the level of transcription across the MPN317-318 junction. In addition, we measured mRNA levels in two noncoding regions where we do not expect significant transcription to occur. This measured putative “baseline” levels of transcription by providing a comparison relative to levels within coding regions, allowing us to determine their significance. One region analyzed lay in the noncoding junction between MPN313 and MPN314, flanking a tRNA sequence. The other region lay about 70 nt downstream of the transcription terminator immediately following MPN319.
FIG. 4. Quantitative Taqman real-time RT-PCR results for nine regions of analysis within and flanking the M. pneumoniae ftsZ gene cluster. These regions are represented as numbered black arrows on a to-scale diagram. Results have been normalized to the average value of the 313/314 amplicon (1); each point on the graph represents the result for one run. The average value for each amplicon is represented by a labeled bar whose length is to scale. The average for each amplicon comes from at least three repeated experiments and is relative to a standard curve of a recombinant RNA. The trendline shows an exponential fit to the average results for amplicons 2 to 8.
Transcript levels measured in noncoding regions agreed very closely. Transcript levels measured in coding regions were significantly higher (a minimum of 23-fold). Transcript levels among coding regions varied at most fivefold. We saw a general decrease in transcript quantity with increasing distance from the transcription initiation point mapped just upstream of MPN314.
The transcript level of the M. pneumoniae gene cluster is about half a copy per cell. We converted from relative RNA quantities to absolute numbers of RNA molecules per cell by estimating two factors. One was the reverse transcription efficiency of our assay, and the other was the amount of total RNA per M. pneumoniae cell.
We used real-time PCR to amplify the same sequence from identical quantities of DNA and RNA. This told us what fraction of RNA molecules were converted into cDNA. We estimated very similar reverse transcription efficiencies for the range of RNA quantities and contexts we amplified with real-time PCR, assuring us that efficiency did not vary with sample type in our assay. We calculated reverse transcription efficiency at about 125%. This number was over 100%, possibly due to an overestimation of the amount of plasmid DNA or to less efficient amplification of the DNA than the cDNA because it is double stranded. We assumed 100% reverse transcription efficiency in our calculations of absolute mRNA quantities.
In order to estimate the amount of total M. pneumoniae RNA per cell, we isolated both total RNA and DNA from the same M. pneumoniae culture. This allowed us to obtain a ratio of total cellular RNA to genomic DNA. Assuming two DNA molecules per cell, this gave us a ratio of RNA to cell number. We obtained an average of 3.2 × 10⁻⁹ μg RNA per cell, about double the estimation of Weiner et al. (35). We consider this to be good agreement and therefore feel that these estimates are reliable.
These estimates placed mRNA levels at seven positions within and across coding regions (amplicons 2 to 8; Fig. 4) between 0.13 and 0.64 molecule per cell, and at two positions within noncoding regions (amplicons 1 and 9; Fig. 4) at 0.006 molecule per cell (see Materials and Methods).
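As a quick arithmetic check, the fold differences implied by these per-cell values can be computed directly. This sketch uses only the rounded values quoted above, so the ratios need not match exactly the fold differences reported from the underlying measurements. The variable names are illustrative.

```python
# Rounded per-cell estimates quoted in the text (molecules per cell).
coding_min, coding_max = 0.13, 0.64   # range across amplicons 2 to 8
noncoding = 0.006                     # amplicons 1 and 9

# Spread among coding-region measurements.
fold_within_coding = coding_max / coding_min

# Coding regions relative to the noncoding "baseline".
fold_over_baseline_min = coding_min / noncoding
fold_over_baseline_max = coding_max / noncoding

print(f"variation among coding regions: {fold_within_coding:.1f}-fold")
print(f"coding vs noncoding: {fold_over_baseline_min:.0f}- to "
      f"{fold_over_baseline_max:.0f}-fold")
```

These ratios are in line with the roughly fivefold variation among coding regions and the approximately 20- to 100-fold excess of coding over intergenic signal summarized in the abstract.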
DISCUSSION
Functionally unrelated genes may be commonly cotranscribed in M. genitalium and M. pneumoniae due to a lack of transcriptional terminators. We mapped transcripts for four genes that comprise a conserved cell division gene cluster, as well as two genes of disparate function closely spaced with this cluster. Transcription of this region was strongly, although not absolutely, strand specific and of the expected polarity. Several areas of RT-PCR analysis seemed to show a relatively small amount of transcription from the unexpected strand. These areas were nonorthologous between M. genitalium and M. pneumoniae. This may have occurred because the assay is not completely strand specific. Alternatively, very low levels of transcription may occur from the noncoding strand, perhaps from transcription of upstream genes that is not discretely terminated. We might expect these rare transcripts to be detected for some regions and not others if primers have different efficiencies.
Our results suggest the transcripts illustrated in Fig. 5. Both RT-PCR and RNase protection results support cotranscription of all six of these closely spaced genes in M. pneumoniae. RT-PCR results also support this conclusion in M. genitalium. However, RNase protection revealed the presence of a transcription initiation site within the group of M. genitalium genes. This allows for separate transcriptional control of the expression of the two amino acid permeases downstream of the phylogenetically conserved cell division gene cluster. RNase protection did not detect cotranscription of MG224 with MG225, as was indicated by RT-PCR. This may be because the RT-PCR assay is much more sensitive than the RNase protection assay. These results may also indicate that this level of cotranscription is very low compared to the level of transcription initiated just upstream of MG225.
FIG. 5. Summary of results of transcriptional analyses of and surrounding the phylogenetically conserved ftsZ gene clusters of M. pneumoniae and M. genitalium. Each diagram is to scale and shows the cluster in the context of surrounding genes of disparate function. This demonstrates how gene clusters may be identified merely as groups of closely spaced genes, regardless of functional information. Transcripts that can be deduced from each type of experimental approach are shown as long arrows below the diagrams. A transcription initiation point is represented by a short arrow, termination at a poly(U)-tailed hairpin by a stem-loop symbol, and termination at a sequence of unknown significance by a vertical line. If an experiment does not define the termini of a transcript, then the region that can be inferred is extended by dotted lines.
Multiple promoters within a gene cluster have been identified for the mgp gene cluster of M. genitalium (21), the orthologous p1 gene cluster of M. pneumoniae (14, 21), and the hmw gene cluster of M. pneumoniae (31), all involved in cytadherence.
Transcription starts were identified for the first two out of three genes comprising the mgp and p1 gene clusters (14, 21). Cotranscription of the genes in the mgp gene cluster was shown by RT-PCR (21). This promoter organization may reflect the phylogenetic conservation and functions of these genes. The first gene in the clusters is a member of a COG that exists across a wide variety of bacteria as well as some archaea. The latter two genes in the clusters are not members of COGs and have been identified only in M. genitalium, M. pneumoniae, and M. gallisepticum. While these latter two genes have been experimentally implicated in cell adhesion in both M. genitalium and M. pneumoniae (8, 15, 20), the first gene has not been characterized.
Transcription starts were identified for 4 out of 10 genes comprising the hmw gene cluster, all of which were shown to be cotranscribed by RT-PCR (31). The genes in this cluster have a diverse group of functions, and only three are members of COGs. The cluster includes three genes involved in M. pneumoniae cytadherence but also a ribosomal protein and a DNA polymerase III subunit. The meanings of the promoter and gene organizations in this cluster are unclear.
This evidence is consistent with the expectation that M. genitalium and M. pneumoniae cotranscribe genes of related function. However, they also cotranscribe genes that are not functionally related. This may be the norm rather than the exception in these organisms. If they possess few signals for discrete and efficient transcription termination, then many genes may be transcribed by “run-on” transcription from upstream promoters. Since no Rho factor has been identified for M. genitalium or M. pneumoniae, they may use poly(U)-tailed hairpins to terminate transcription, as in Rho-independent termination in E. coli. This mechanism does operate, as in the case of the hairpin 3′ to MPN319. However, analyses suggest that few such termination signals exist (10, 32). Intergenic spaces in these organisms are very small, perhaps not leaving room for terminators.
If there are few terminators, then the positions of these elements may not be relevant to the functional grouping of genes. Regardless of whether two genes are cotranscribed, a promoter between the two may indicate that they are not functionally related. For instance, cotranscription evidence suggests that the ribosomal protein in the hmw cluster is part of an operon with upstream genes. However, it has its own promoter and is the 3′-terminal gene in the cluster. These features could suggest that it is not involved in cytadherence.
It is unclear how strongly M. genitalium and M. pneumoniae regulate transcription initiation. They seem to possess only one sigma factor, with weak evidence for a possible second factor in M. pneumoniae (2). A strong −10 promoter consensus but no strong −35 consensus could be found in M. pneumoniae (34). Is most transcription in these organisms constitutive? If so, then there might be no selective pressure to maintain a promoter for the two amino acid permeases, as they could be transcribed from a promoter further upstream. This may be why we detected transcription initiation immediately upstream of these two genes in M. genitalium but not in M. pneumoniae.
In both species, transcription terminates discretely downstream of the second amino acid permease. The termination point identified in M. pneumoniae corresponds to one of only a few poly(U)-tailed hairpins identified in this species by the program TransTerm (10). Even though we observed termination of transcription just downstream of MG226 and initiation just upstream of MG227, we were also able to detect transcription across the junction of these two genes. This indicates that termination is not 100% efficient. We identified two termination points in this region. The upstream point lies within the roughly 30-bp gap separating MG226 from an MgPa repetitive element (24), or perhaps just inside this element. The other point lies about 40 nt downstream, within a roughly 70-bp CT-rich sequence (approximately 55% C and 33% T) just inside the element. Termination may occur here due to slippage of the RNA polymerase at a region of low complexity, an inadequate supply of C and U ribonucleotides, or impeded strand separation because of the GC-rich sequence. Since this sequence also lies within MG191, where we do not expect transcription termination, we might not expect this to be a conserved mechanism for such an event. We did not find other occurrences of this sequence type in the rest of the genome, outside of other MgPa repetitive elements.
Evaluation of relative and absolute transcript levels. We cannot necessarily construct a profile of an mRNA as it is transcribed, since multicistronic transcripts may be selectively cleaved or degraded at certain sites, producing fragments with different half-lives (5). In our profile of a multikilobase region, we interpolated between small regions of analysis. Nevertheless, we were able to see a trend in the fivefold difference in mRNA quantity among the six genes we examined (Fig. 4). mRNA levels were highest at the beginning of the cluster and decreased with distance from the mapped transcription start site for MPN314. We quantitated mRNA levels of MPN314 and MPN315 at about twice those of MPN316 and MPN317, which in turn were about twice those of MPN318 and MPN319. Given the resolution of our assay (about twofold), these quantities were similar to each other. Therefore, the trend they formed was very gradual.
This trend supports our evidence that there is no promoter immediately upstream of the two amino acid permeases in M. pneumoniae, although there is one in M. genitalium. If such a promoter were active in M. pneumoniae, we might expect to see higher transcript levels for the first permease than for the gene preceding it. However, we might have failed to detect an initiation point here if it were very weak. Assuming one transcription unit for this cluster, the gradient suggests that we may be viewing an increased dissociation of the RNA polymerase from the template with distance from the promoter. Normalized microarrays showed gradients in transcript levels across several operons in E. coli, while other operons showed similar mRNA levels for each of their genes (33). The gradient seen is probably not due to degradation, as transcripts in E. coli are degraded 5′ to 3′ rather than 3′ to 5′ (26).
We saw significant differences (at least 20-fold) between transcript levels within coding regions versus within intergenic regions, where we did not expect transcription to occur. Intergenic transcript levels were 10-fold above our limit of detection and agreed very closely. This difference in quantity was not distinguishable through end-point analysis of the real-time RT-PCRs. This cautions us that not all positive RT-PCR results may indicate biologically relevant amounts of mRNA.
We can compare our calculations of absolute RNA quantities per cell to those of another recent study. A real-time PCR analysis of nine genes in M. pneumoniae, chosen to represent the range of signals seen in microarray hybridizations, produced an estimate for the most abundant mRNA of about 0.05 copy of a transcript per cell under 37°C growth conditions (35). The authors suspect that their estimates of transcript levels are about 10-fold too low. Our estimates agree with this conclusion: corrected 10-fold, their most abundant mRNA would be present at about 0.5 copy per cell, close to the 0.6 copy per cell we estimated for the most abundant mRNA we detected. The least abundant mRNA they measured would be about half the quantity of the noncoding mRNAs we examined. Overall our estimates of transcript quantities are higher than theirs, but still comparable.
We can also evaluate our results using our estimate of the amount of total RNA per M. pneumoniae cell. If this is 3.2 × 10⁻⁹ μg, 4% of that is mRNA (as estimated for E. coli) (22), and 1 nt of single-stranded RNA has an average molecular weight of 320.5, then the cell contains about 2.40 × 10⁵ nt of mRNA, or 240 transcript lengths of 1 kb each, the average size of a prokaryotic gene. Microarray analysis of the M. pneumoniae genome detected transcription for 90%, or about 600, of its genes at 37°C (35). Assuming that the 2.40 × 10⁵ nt of mRNA per cell does not represent more than one transcript of any region, this gives an average of about 0.4 copy per cell of a transcript for any given protein-coding gene. It may seem counterintuitive that this number is less than 1. However, even a minimal cell may transcribe different genes at different times. Since a bacterial mRNA exists on average only a couple of minutes, transcripts for all genes might not be present in a cell at the same time.
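The arithmetic behind this estimate can be retraced in a few lines. The following is only an illustrative sketch, not part of the original analysis: the input values are those stated above, Avogadro's number (6.022 × 10²³ per mole) is supplied as the one constant not given in the text, and all variable and function names are hypothetical.

```python
# Illustrative recalculation of the per-cell mRNA estimate described above.
# All names are hypothetical; the input values are those quoted in the text.

AVOGADRO = 6.022e23              # molecules per mole (not stated in the text)

TOTAL_RNA_UG_PER_CELL = 3.2e-9   # estimated total RNA per M. pneumoniae cell (ug)
MRNA_FRACTION = 0.04             # fraction of total RNA that is mRNA (E. coli estimate)
NT_MOLECULAR_WEIGHT = 320.5      # average molecular weight of 1 nt of ssRNA (g/mol)
TRANSCRIPT_LENGTH_NT = 1000      # assumed average transcript length (1 kb)
TRANSCRIBED_GENES = 600          # genes with detectable transcription at 37 C


def mrna_nucleotides_per_cell(total_rna_ug, mrna_fraction, nt_mw):
    """Convert a mass of total RNA per cell into a count of mRNA nucleotides."""
    mrna_grams = total_rna_ug * 1e-6 * mrna_fraction   # ug -> g, then mRNA share
    mrna_moles_nt = mrna_grams / nt_mw                 # moles of nucleotides
    return mrna_moles_nt * AVOGADRO                    # number of nucleotides


mrna_nt = mrna_nucleotides_per_cell(
    TOTAL_RNA_UG_PER_CELL, MRNA_FRACTION, NT_MOLECULAR_WEIGHT
)
transcripts_per_cell = mrna_nt / TRANSCRIPT_LENGTH_NT
copies_per_gene = transcripts_per_cell / TRANSCRIBED_GENES

print(f"mRNA nucleotides per cell: {mrna_nt:.3g}")
print(f"1-kb transcript equivalents per cell: {transcripts_per_cell:.0f}")
print(f"average copies per transcribed gene: {copies_per_gene:.2f}")
```

Under these inputs the calculation yields values consistent with those quoted above (about 2.40 × 10⁵ nt, about 240 transcript lengths, and about 0.4 copy per gene). The result scales linearly with the assumed mRNA fraction and inversely with the assumed transcript length, so it is best read as an order-of-magnitude check.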
The mRNA copy numbers we derived seem reasonable, which demonstrates the effective use of real-time RT-PCR for the absolute quantitation of mRNAs in a cell. What do these numbers mean in context? Further work on a number of genes will be needed to establish a relative expression profile for M. pneumoniae. This will tell us what mRNA quantities constitute low, average, and high gene expression for this bacterium at the transcriptional level.
ACKNOWLEDGMENTS
We thank Michael Conrad for useful theoretical and technical discussions.
This work was supported by a Public Health Service research grant subcontracted to UNC-CH (Clyde A. Hutchison III) from the Berkeley Structural Genomics Center (GM62412, Sung-Hou Kim).
FOOTNOTES
- Received 11 January 2005.
- Accepted 29 March 2005.
- Copyright © 2005 American Society for Microbiology