Journal of Bacteriology, October 2007, p. 6787-6795, Vol. 189, No. 19
0021-9193/07/$08.00+0 doi:10.1128/JB.00882-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Department of Biology, Concordia University, Montréal, Québec, Canada H4B 1R6
Received 5 June 2007/ Accepted 15 July 2007
Previous studies have shown that cellulolytic activity in C. thermocellum is regulated by either carbon source or growth rate (or both) and that changes with respect to one or the other are reflected in overall cellulase production (47) and in the cellulosomal subunit profile (4, 11, 28, 35). Catabolite repression by nonlimiting concentrations of readily metabolized carbon sources has been the standing hypothesis for cellulase regulation in C. thermocellum for more than 20 years (12). The immediate availability of energy results in an increased growth rate and leads to the repression of genes required to mine energy from crystalline cellulose. Lower growth rates and cellulose as a substrate seem to promote cellulase production, as has been demonstrated for the processive glycoside hydrolase family 48 (GH48) exoglucanase CelS, both at the protein (4) and the mRNA level (7, 38), as well as for the transcription of the GH5 endoglucanases celB and celG and the GH9 endoglucanase celD (9). Transcription of the scaffoldin gene cipA and cell surface anchoring genes olpB and orf2p is likewise controlled by growth rate and/or carbon source, which is not the case for another cell surface gene, sdbA (8, 38).
Sequencing and annotation of the C. thermocellum ATCC 27405 genome led to the discovery of more than 60 open reading frames coding for products with putative Doc1 domains (50), that is, proteins that can potentially bind to CipA and contribute to cellulosomal activities. Among these are genes for endoglucanases, exoglucanases, xylanases, and other hemicellulases. The predicted catalytic activity or function of about one-quarter of these genes is unknown. Considering the number of "dockable" candidate open reading frames, relatively few, or about one-third, of the products of these genes have been identified from the cellulosome complex itself. The participation in the cellulosome of the remaining putative gene products remains moot.
Low expression levels and overlapping and/or novel biochemical activity not detected by frequently used activity assays can account for the difference between the number of cellulosomal proteins predicted and the number of those that have been biochemically characterized. Mass spectrometry (MS) has become an increasingly popular tool in the study of proteins due to its high sensitivity and mass accuracy, and its quantitative applications are being progressively refined (36). The most wide-ranging C. thermocellum cellulosome study until now coupled a two-dimensional gel electrophoresis system with protein mass fingerprinting by matrix-assisted laser desorption ionization MS, giving rise to the simultaneous identification of 13 docking components from a cellulose-grown culture (50).
In the present study, we report quantitative differences between the subunit profiles of cellulosomes from cells grown in liquid batch cultures on Avicel (crystalline cellulose) versus cellobiose as the carbon source. In comparing the cellulosomes from cells grown on these two substrates, we expected to detect several novel gene products and also to uncover differences in protein expression that can shed more light on our understanding of the regulation of cellulosomal cellulases and hemicellulases. A metabolic isotope-labeling strategy was used in conjunction with nano-liquid chromatography-electrospray ionization MS (nano-LC-ESI-MS) peptide sequencing to assess alterations in the expression patterns within cellulosomes grown under different conditions. Moreover, a peptide-counting technique was applied to approximate the relative abundance of each cellulosome component per sample.
Analysis of purified cellulosomes by nano-LC-ESI-MS.
The resulting purified cellulosomes were separated by 6% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and stained with Coomassie blue. Sample lanes from the gel were excised and divided into 15 gel bands, with each band containing on average roughly 11 µg of protein. The protein in each gel band was subsequently reduced, alkylated, and digested with trypsin TPCK (N-tosyl-L-phenylalanine chloromethyl ketone; Sigma-Aldrich), as described previously (24). The resulting peptide mixtures were removed from the gel pieces using excess extraction buffer, dried, and then made up in equal volumes of 8% (vol/vol) acetonitrile in 0.1% (vol/vol) formic acid. Peptide samples were injected quantitatively for separation on a PicoFrit BioBasic C18 nanocolumn (New Objective; 10-cm length by 75-µm inner diameter; 5-µm particle size; 300-Å pore size) with a 60-min solvent gradient, ranging from 3% to 50% acetonitrile in 0.1% formic acid, at a flow rate of 1 µl · min⁻¹. Before flowing to the column, the sample was cleaned of impurities using a C18 peptide trap. Under these conditions, most peptides eluted in about 30 s or 500 nl. Detection and sequencing of peptide ions was accomplished by an LTQ ion trap MS (Thermo Electron, San Jose, CA), equipped with an ESI nanosource and operating in positive mode with a voltage of 1.4 kV applied at a liquid junction just upstream of the column. An initial full MS survey scan (10 ms) was performed for the m/z range of 400 to 2,000, followed by several data-dependent scans (33 ms each). The seven most abundant ions from the survey scan were subjected to tandem MS (MS/MS) for sequencing using pulsed-Q dissociation for ion fragmentation. A triggering threshold of three times the noise level (signal-to-noise ratio [S/N]) was applied for MS/MS events. Peptide ions that triggered an MS/MS more than once within a 30-s window were placed on an exclusion list for 3 min to improve the possibility of detecting less abundant ions.
Database screening and success criteria. Using SEQUEST from BioWorks 3.3 (Thermo Electron), the peptide sequence results were searched against the 16 February 2007 release of the C. thermocellum genome available at NCBI courtesy of the Department of Energy, Joint Genome Institute (http://www.ncbi.nlm.nih.gov; RefSeq accession number NC_009012). The database was digested in silico with trypsin and indexed for carboxymethylation of cysteine residues to include masses within the range of 400 to 3,500 Da. A peptide tolerance of ±2 atomic mass units was implemented. Charge state analysis was performed during DTA file filtering, and a series of high-stringency filters was applied to the search results. Singly, doubly, and triply charged peptide ions required SEQUEST cross-correlation (XC) scores of at least 1.8, 2.5, and 3.5, respectively. Peptide and protein hits also needed probability scores, as calculated by BioWorks, of less than 10⁻³. Moreover, only proteins identified on the basis of two or more unique peptides were considered in the final analysis. The SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP/) was used to verify that proteins contained an N-terminal peptide signaling secretion from the cell (10).
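The acceptance criteria above can be read as a two-stage filter: a per-peptide test on charge-dependent XC score and probability, followed by a per-protein test on the number of unique peptides. The following Python sketch illustrates that logic only; it is not the BioWorks or SEQUEST implementation, and the function names and the tuple layout of a hit are assumptions made for the example.

```python
from collections import defaultdict

# Thresholds taken from the text; everything else here is illustrative.
MIN_XC = {1: 1.8, 2: 2.5, 3: 3.5}   # minimum XC score per charge state
MAX_PROBABILITY = 1e-3               # probability score must be below this
MIN_UNIQUE_PEPTIDES = 2              # unique peptides required per protein


def passes_peptide_filter(charge, xc, probability):
    """Return True if a peptide hit meets the XC and probability criteria."""
    threshold = MIN_XC.get(charge)
    if threshold is None:
        # Charge states other than 1+, 2+, 3+ are not covered by the criteria.
        return False
    return xc >= threshold and probability < MAX_PROBABILITY


def accepted_proteins(hits):
    """Group passing peptides by protein and keep proteins with >= 2 unique peptides.

    `hits` is an iterable of (protein_id, peptide_sequence, charge, xc, probability);
    this layout is a hypothetical one chosen for the sketch.
    """
    peptides = defaultdict(set)
    for protein, sequence, charge, xc, probability in hits:
        if passes_peptide_filter(charge, xc, probability):
            peptides[protein].add(sequence)
    return {p: s for p, s in peptides.items() if len(s) >= MIN_UNIQUE_PEPTIDES}
```

A protein supported by a single passing peptide is dropped even if that peptide scores very well, which is the conservative choice the two-peptide rule encodes.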
RelEx analysis. DTA files were filtered separately using DTASelect (39), which assembles the peptides into proteins using the same XC score stringency factors as above. The filtered DTA files were then analyzed by RelEx (33), which generates extracted ion chromatograms of peptide isotope pairs and uses the areas under each curve to calculate a peptide signal ratio of sample to isotope-labeled reference. An extracted ion chromatogram pair was rejected if the S/N ratio was below 3 or if the correlation factor, the measure of the overlap of the curves, was below 0.9. Protein ratios were calculated as averages of the ratios of the peptides matched to them. The ratio of each unlabeled Avicel-grown protein to 15N-labeled Avicel-grown protein was divided by the ratio of the corresponding unlabeled cellobiose-grown protein to 15N-labeled Avicel-grown protein. The quotient of the ratios is the ratio of unlabeled Avicel-grown protein to cellobiose-grown protein. In such a way, this strategy corrects for any systematic errors introduced during sample preparation (33). All ratios were normalized to that obtained for the comparison of CipA.
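The ratio-of-ratios calculation described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the RelEx code: the function names are hypothetical, and the inputs are assumed to be peptide ratios that RelEx has already extracted from the ion chromatograms.

```python
from statistics import mean

MIN_SN = 3.0            # chromatogram pairs with S/N below this are rejected
MIN_CORRELATION = 0.9   # pairs with a correlation factor below this are rejected


def keep_pair(signal_to_noise, correlation):
    """Apply the S/N and correlation filters to one extracted ion chromatogram pair."""
    return signal_to_noise >= MIN_SN and correlation >= MIN_CORRELATION


def protein_ratio(peptide_ratios):
    """Protein ratio = average of the sample-to-reference ratios of its peptides."""
    return mean(peptide_ratios)


def avicel_to_cellobiose(avicel_vs_ref, cellobiose_vs_ref):
    """Divide the two sample-to-reference ratios; the shared 15N reference cancels."""
    return avicel_vs_ref / cellobiose_vs_ref


def normalize_to_scaffoldin(fold_changes, scaffoldin="CipA"):
    """Express every fold change relative to the value obtained for CipA."""
    reference = fold_changes[scaffoldin]
    return {name: value / reference for name, value in fold_changes.items()}
```

Because both samples are measured against the same 15N-labeled Avicel-grown reference, any bias that affects the reference equally in both mixtures drops out of the quotient.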
emPAI analysis. The exponentially modified protein abundance index (emPAI), which was shown to bear a linear relationship to protein concentration, is defined as 10^PAI minus 1, where PAI is the ratio of the number of MS-observed peptides for a given protein over its theoretically observable peptides (19). The unique peptide parent ions matched for a given protein were counted as its observed peptides. For theoretical peptides, the relative hydrophobicity of a protein's in silico tryptic digest products (no missed cleavages) was calculated using the Sequence Specific Retention Calculator available at http://hs2.proteome.ca/SSRCalc/SSRCalc.html (25). Peptide retention times were predicted based on relative hydrophobicity and coefficients derived from our data set. Theoretical peptides were accepted within a retention time window of 12 to 68 min and a mass window of 400 to 3,500 Da. All emPAI values were normalized to that obtained for CipA, assuming that one CipA protein exists per cellulosome.
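As a worked illustration of the definition above, the following Python sketch computes emPAI from peptide counts and normalizes the values to CipA. The function names are hypothetical, the retention time prediction itself (SSRCalc plus data-set-specific coefficients) is not reproduced, and the acceptance windows are assumed to be inclusive at both ends.

```python
RT_WINDOW_MIN = (12.0, 68.0)       # accepted predicted retention time, minutes
MASS_WINDOW_DA = (400.0, 3500.0)   # accepted peptide mass, Da


def is_observable(predicted_rt_min, mass_da):
    """Return True if a theoretical tryptic peptide falls inside both windows."""
    return (RT_WINDOW_MIN[0] <= predicted_rt_min <= RT_WINDOW_MIN[1]
            and MASS_WINDOW_DA[0] <= mass_da <= MASS_WINDOW_DA[1])


def empai(observed_peptides, observable_peptides):
    """emPAI = 10**PAI - 1, where PAI = observed / theoretically observable."""
    if observable_peptides <= 0:
        raise ValueError("a protein needs at least one observable peptide")
    return 10 ** (observed_peptides / observable_peptides) - 1


def normalize_to_cipa(empai_values, scaffoldin="CipA"):
    """Express each emPAI per CipA, assuming one CipA per cellulosome."""
    reference = empai_values[scaffoldin]
    return {name: value / reference for name, value in empai_values.items()}
```

For example, a protein with 5 observed peptides out of 5 observable ones has PAI = 1 and emPAI = 9, while a protein with no observed peptides has emPAI = 0.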
FIG. 1. C. thermocellum cellulosomal protein separated by SDS-PAGE (6%), stained with Coomassie blue. Lane A, 1:1 (vol/vol) mixture of unlabeled cellobiose-grown and 15N-labeled Avicel-grown cellulosomes from late stationary phase (170 µg of total protein); lane B, 1:1 (vol/vol) mixture of unlabeled Avicel-grown and 15N-labeled Avicel-grown cellulosomes from late stationary phase (170 µg of total protein). Molecular weight (mol wt) markers are shown at left. At right, the approximate molecular weight ranges for the division of the gel bands for trypsin digestion are shown.
TABLE 1. C. thermocellum Avicel-grown cellulosomal components identified by nano-LC-ESI-MS, ranked by emPAI
TABLE 2. C. thermocellum cellobiose-grown cellulosomal components identified by nano-LC-ESI-MS, ranked by emPAI
TABLE 3. Fractional differences in expression of C. thermocellum Avicel-grown cellulosomal components relative to cellobiose-grown components by RelEx, ranked by P value, and normalized to CipA
There are significant differences in the relative abundances of docking subunits per CipA between the two data sets as per molar percentage calculated from emPAI values. Exoglucanases accounted for a total molar percentage of 24.4% of the total moles per CipA of all docking subunits detected when cells were grown on Avicel but only 9.2% when cells were grown on cellobiose. The molar percentage of CelS dropped from 9.4% on Avicel to 1.2% on cellobiose, while values for the GH9 exoglucanases CelK and CbhA changed from 11.0 to 5.8% and 4.1 to 2.1%, respectively. Components with known endoglucanase activity accounted for a total molar percentage of 40.0% when cells were grown on Avicel, but this decreased to 26.1% on cellobiose. In total, GH9 cellulases decreased from 43.6% on Avicel to 19.2% on cellobiose, whereas enzymes containing a GH5 domain increased slightly from 20.2% on Avicel to 23.0% on cellobiose. The GH5 fold is predominantly associated with cellulases, but it has also been linked to hemicellulolytic activity (37). A new GH5 enzyme (gi 125973339) was detected among the most abundant catalytic subunits in both samples (6.9% on Avicel and 5.9% on cellobiose). It has a predicted mass of 63.0 kDa and exhibits SDS-PAGE migration properties similar to those of CelB and CelG, with masses of 63.9 and 63.2 kDa, respectively. Its overlap with these proteins might explain why it was not identified previously. Overall, the molar percentage of hemicellulases increased from 19.9% on Avicel to 50.3% on cellobiose. Docking subunits with xylanase activity accounted for a total of 11.3% of all docking subunits detected when cells were grown on Avicel, but their contribution increased to 34.3% when cells were grown on cellobiose. Other hemicellulases accounted for a total molar percentage of 8.6% on Avicel and 15.1% on cellobiose. 
GH9 cellulases were the most abundant group of enzymes per CipA when cells were grown on Avicel, while hemicellulases were the most abundant group on cellobiose.
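The molar percentages quoted above follow from the CipA-normalized emPAI values. A minimal Python sketch of that conversion is given below, assuming that each docking subunit's share is its emPAI divided by the summed emPAI of all docking subunits detected in the sample; the function names and the example grouping are hypothetical, and the exact denominator used by the authors is not spelled out in the text.

```python
def molar_percentages(empai_per_cipa):
    """Convert emPAI values of docking subunits into molar percentages.

    Assumption: the denominator is the sum over all docking subunits passed in.
    """
    total = sum(empai_per_cipa.values())
    if total <= 0:
        raise ValueError("total emPAI must be positive")
    return {name: 100.0 * value / total for name, value in empai_per_cipa.items()}


def group_share(percentages, members):
    """Sum the molar percentages of one functional group, e.g. the exoglucanases."""
    return sum(percentages[name] for name in members if name in percentages)
```

Summing shares by functional group in this way is how a statement such as "exoglucanases accounted for a given percentage of all docking subunits" can be derived from per-protein values.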
Other notable differences between the two samples concern the 13 components detected exclusively in one sample but not the other. Detected only in Avicel-grown cellulosomes were GH9 endoglucanases CelN and CelQ, the GH16 lichenase LicB, the GH26 mannanase ManA, a new GH9 cellulase, a new subunit with putative endopolygalacturonase activity, and a new cell-surface anchor protein predicted to have the same number of type II cohesin domains as OlpB but no SLH (S-layer homology) domain. XynD and XynY, both with GH10 xylanase activity, were detected exclusively in cellobiose-grown cellulosomes, along with the cell-surface anchor protein SdbA, a new bifunctional GH30/α-L-arabinofuranosidase B hemicellulase, a new GH43 glycosidase, and a new bifunctional GH43/α-L-arabinofuranosidase B glycosidase.
Relative differences in abundance of cellulosomal components induced by Avicel or cellobiose. Simultaneous quantitative differences in the expression of all but four cellulosomal components common to both Avicel and cellobiose were measured by means of metabolically 15N-labeled peptides as internal standards. While emPAI supplied a means of determining the relative abundance of proteins in a given sample, RelEx provided a highly reliable way to compare the amount of a particular protein present in two samples. Sample-to-reference ratios were determined separately for Avicel- and cellobiose-grown cellulosomes, and the ratio of ratios represented the fractional difference between proteins grown on either substrate. Normalization of ratio values to the value obtained for the scaffoldin protein CipA allowed for comparison of changes in protein expression per cellulosome complex. That the average ratio of unlabeled Avicel-grown protein to 15N-labeled protein was 1.23 with a standard deviation of 0.29 (Table 3) suggests that our methodology was accurate (and precise) at determining ratios between cellulosomal proteins from two separate samples.
From the total of 29 proteins found in both samples, RelEx was able to determine a ratio of sample-to-reference for 25 protein pairs, given the S/N and correlation filters adopted (Table 3). The null hypothesis was rejected for all but four of these, for which it was determined that P was >0.05. There was no significant change in expression for these four proteins: two new GH9 cellulases and two hemicellulases, ChiA and a new GH53 subunit, whether obtained from Avicel- or cellobiose-grown cells. Proteins for which significant differences were observed are represented visually over a logarithmic scale in Fig. 2.
FIG. 2. Fractional differences in expression of C. thermocellum Avicel-grown cellulosomal components relative to cellobiose-grown components by RelEx, normalized to CipA, over a logarithmic scale. Docking components are grouped by function and activity. CE, carbohydrate esterase family. Only proteins passing the null value with a P of <0.05 are shown. Columns rising above 1 represent proteins determined to have greater expression in the Avicel-grown sample. Columns falling below 1 represent proteins with higher expression in the cellobiose-grown sample. Error bars traversing 1 signify no change in expression between the two samples.
Noncellulosomal proteins detected. Four noncellulosomal proteins with signal peptides for secretion were detected (not shown in Table 1 or 2). The GH9 endoglucanase CelI (gi 125972564) was detected in the cellobiose cellulosome sample (53). It was identified by two unique peptides. From the Avicel-grown sample only, three unique peptides were matched to a predicted 34-kDa protein (gi 125972914) with similarity (E value of 3E-32) to RbsB (COG1879), a ribose-binding protein in Escherichia coli. This protein also has a lipid attachment site to anchor it to the membrane. In the Avicel- and cellobiose-grown cellulosome preparations, 17 and 10 unique peptides, respectively, matched to a predicted 50-kDa protein (gi 125973535) with similarity (E value of 1E-42) to UgpB (COG1653), a periplasmic glycerol-3-phosphate-binding protein in E. coli. Finally, seven unique peptides from both samples were matched to a predicted 113-kDa protein (gi 125974833) with a possible (E value of 0.006) SLH domain (pfam00395) for anchoring it to the cell wall and also an immunoglobulin-like fold, which may behave like a carbohydrate binding domain. This protein had been recently observed in the cell membrane fraction (42). All three of the latter proteins were observed in considerable abundance (at least 25% amino acid coverage) in the total extracellular protein fraction from cells grown on cellobiose (data not shown). Their high abundance and, more particularly, the presence in each of them of a possible carbohydrate binding domain point to the possibility that these proteins are contaminants of the cellulosome preparations, consistently copurifying with cellulosome-cellulose complexes.
This possibility does not, however, preclude the alternative: that they may in fact be specifically associated with these complexes and play roles in secondary cellulosomal product-related function, perhaps in the uptake of cellodextrins in the manner of RbsB from Bacillus subtilis (43) or MalX from Streptococcus pneumoniae (14), both lipoproteins involved in sugar transport in gram-positive bacteria.
The three known docking subunits to escape detection were the noncatalytic docking component CseP (53), the serine protease inhibitor PinA (22), and the bifunctional component CelH (42); however, all three of these were observed by us in earlier trials (data not shown) in which either no reference protein was mixed in or the reference had not been 15N-enriched to 99%. CseP and PinA were detected on both substrates, whereas CelH, which has both a GH5 and a GH26 domain, was detected only on cellobiose. CelO, the only known GH5 exoglucanase in C. thermocellum (52), is the only previously cloned docking gene product never to be detected by us.
XynD was detected exclusively on cellobiose even though it had been discovered on cellulose by MS (50), and ManA and LicB were detected exclusively on Avicel, whereas they had previously been observed on cellobiose by Western blot analysis (15, 49). These discrepancies could be explained by the differences between the protein identification methods used in the previous studies and the method used in the present work.
Growth on the different substrates revealed a similar mix of cellulosomal components that were present in significantly different relative amounts. Differences in the relative expression levels of individual components grown on either carbon source demonstrated GH family-specific regulatory patterns, providing evidence in support of existing hypotheses for cellulosomal component regulation as well as contributing a novel distinction with respect to endoglucanase synthesis.
The exoglucanase CelS exhibited the greatest increase of any docking component during growth on Avicel compared to cellobiose. The increase of CelS on Avicel versus cellobiose had already been observed at the protein level by SDS-PAGE (4) and Western blot analysis (7). This result also agrees with changes in celS transcript levels per cell between growth on cellulose and cellobiose (7). Exoglucanases are the key enzymes in cellulase mixtures effective on crystalline cellulose (40), so it was not surprising that exoglucanase CelK also increased on Avicel, even while the expression of CbhA did not change significantly.
Docking proteins with known endoglucanase activity demonstrated varied expression patterns. The GH5 endoglucanases CelB, CelE, and CelG demonstrated higher expression when cells were grown on cellobiose than on Avicel. The same was true for CelA from GH8. In contrast, CelJ from GH9 showed increased expression on Avicel, while the expression of other GH9 endoglucanases, CelF, CelR and CelT, did not change significantly. The detection of CelN and CelQ on Avicel and not cellobiose may be taken as another indication of increased GH9 endoglucanase production on Avicel. The differential expression of GH9 versus GH5 endoglucanases poses an apparent discrepancy with the recent transcript analysis of Dror et al. (9), who observed increased transcript levels per cell of each of the endoglucanase genes celB and celG from GH5 and celD from GH9 when cells were grown at a low versus a high growth rate and also on cellulose versus cellobiose. Thus, while our results with respect to GH9 endoglucanases agree with these previous findings at the transcript level, the increase of GH5 endoglucanases and of CelA on cellobiose was a somewhat unanticipated result. One possible explanation for the difference between the trends observed at the mRNA and protein levels is that GH9 endoglucanase genes may be more responsive to catabolite repression than celA or GH5 endoglucanase genes, such that the former would be more repressed on cellobiose than either of the latter.
The data suggest that the organism has a "cellulolytic preference" for GH9 endoglucanases when degradation of crystalline cellulose is required. In total, cellulosomal GH9 cellulases contained in the C. thermocellum genome outnumber GH5 enzymes by 14 to 8. This preference could be due to what distinguishes them from CelA and GH5 endoglucanases: the presence, in many instances, of a type IIIc carbohydrate binding module, which has been shown to participate in the catalytic activity of the enzyme (1, 2) and to be responsible for processivity (5, 41). What is more, GH9 endoglucanases carry out different modes of attack on cellulose, resulting in cellodextrins of different lengths (1). CelR, which was the most abundant endoglucanase in cellulosomes from Avicel-grown cells, is one such enzyme, a processive GH9 endoglucanase that produces cellotetraose as its primary hydrolysis product (51), which is more energetically favorable for the cell than production of cellobiose (46).
Finally, with respect to hemicellulases, all subunits with xylanase or xyloglucanase activity decreased on Avicel, as per RelEx and emPAI analysis. XynC production has previously been shown to increase on cellobiose (4, 9), and xynC transcript levels have been found to increase on cellobiose in a growth rate-independent fashion (9). In this study, XynZ, XynA, XynC, and XghA were among the five most abundant docking components in cellobiose-grown cellulosomes, along with CelA. XynD and XynY were not detected in the Avicel sample, possibly because their signals were overwhelmed by those of more abundant subunits. On the other hand, their exclusive detection on cellobiose might be taken as another indication of increased xylanase production on cellobiose. Other new subunits with glycosidase and arabinofuranosidase activities were detected exclusively on cellobiose. The trend of increased hemicellulase production on cellobiose could also explain the increase in the bifunctional subunit CelE, which has a family 2 carbohydrate esterase domain in addition to a GH5. As for other hemicellulases, no change was noted for ChiA, and the appearance of LicB and ManA on Avicel but not cellobiose suggests that transcription of these genes was repressed on cellobiose. In the case of manA, Stevenson et al. (38) reported a 10-fold reduction in its transcript level on cellobiose compared to cellulose. Thus, while xylanase transcription is growth rate independent and increases on cellobiose, chitinase, lichenase, and mannanase appear to be under a different type of regulation mechanism. C. thermocellum is unable to utilize the pentose sugars produced by the action of xylanases and other hemicellulases (6, 12); therefore, the apparent role of hemicellulases is to expose cellulose to the action of cellulases. 
When the organism is not mining energy from cellulose, as when it is grown on cellobiose, in general it appears to prepare itself to mine cellulose from plant wall materials, hemicellulose and lignin, as it would in its natural ecosystem.
In conclusion, this work provides a global view of the C. thermocellum cellulosome. During growth on two substrates, the organism produced a wide variety of dockable hydrolytic enzymes, accounting for two-thirds of the genes containing Doc1 sequences. Of the remaining unobserved putative dockable gene products, there are six various hemicellulases, one GH9 cellulase, and about 16 proteins of unknown function, which may be inducible using more complex substrates. An understanding of the mechanisms by which bacteria regulate the expression of the various cellulases and hemicellulases at their disposal will be important to the eventual production of optimal enzyme cocktails or designer cellulosomes used in the breakdown of cellulosic materials for the transition from an oil-based to a carbohydrate-based economy.
This work was supported by research grants from the Natural Sciences and Engineering Research Council of Canada (grant numbers 312357-06 and 330781-06) and the Canada Foundation for Innovation (grant number 202359) as well as a Petro-Canada Young Innovator Award to V.J.J.M.
Published ahead of print on 20 July 2007.