J Bacteriol, June 1998, p. 3091-3099, Vol. 180, No. 12
0021-9193/98/$04.00+0
Copyright © 1998, American Society for Microbiology. All rights reserved.
Multidomain Structure and Cellulosomal Localization of the Clostridium thermocellum Cellobiohydrolase CbhA
Vladimir V. Zverlov,1 Galina V. Velikodvorskaya,1 Wolfgang H. Schwarz,2 Karin Bronnenmeier,2 Josef Kellermann,3 and Walter L. Staudenbauer2,*
Institute of Molecular Genetics, Russian Academy of Science, 123182 Moscow, Russia,1 and Institute for Microbiology, Technical University Munich, 80290 Munich,2 and Max-Planck-Institute for Biochemistry, 82152 Martinsried,3 Germany
Received 29 December 1997/Accepted 16 April 1998
ABSTRACT
The nucleotide sequence of the Clostridium thermocellum
F7 cbhA gene, coding for the cellobiohydrolase CbhA, has
been determined. An open reading frame encoding a protein of 1,230 amino acids was identified. Removal of a putative signal peptide yields
a mature protein of 1,203 amino acids with a molecular weight of 135,139. Sequence analysis of CbhA reveals a multidomain structure of
unusual complexity consisting of an N-terminal cellulose binding domain
(CBD) homologous to CBD family IV, an immunoglobulin-like β-barrel
domain, a catalytic domain homologous to cellulase family E1, a
duplicated domain similar to fibronectin type III (Fn3) modules, a CBD
homologous to family III, a highly acidic linker region, and a
C-terminal dockerin domain. The cellulosomal localization of CbhA was
confirmed by Western blot analysis employing polyclonal antibodies
raised against a truncated enzymatically active version of CbhA. CbhA
was identified as cellulosomal subunit S3 by partial amino acid
sequence analysis. Comparison of the multidomain structures indicates
striking similarities between CbhA and a group of cellulases from
actinomycetes. Average linkage cluster analysis suggests a coevolution
of the N-terminal CBD and the catalytic domain and its spread by
horizontal gene transfer among gram-positive cellulolytic bacteria.
INTRODUCTION
Numerous proteins of higher
organisms have a multidomain architecture consisting of strings of
mobile modules (10). Many of the modules identified so far
have defined binding functions, but some may just act as simple spacer
elements required only to arrange binding surfaces in space. Common
types of constituent modules found in extracellular mosaic proteins are
the fibronectin type III (Fn3) domain and the variants of the
immunoglobulin (Ig) domain. These modules have very similar
three-dimensional folds that form a sandwich of two antiparallel
β-sheets with slightly different strand topologies (7,
25). The broad distribution of these modules in animal proteins
is often regarded as evidence for exon shuffling. Many modules are
found in multiple copies resulting from several gene duplications after
the original shuffling event.
Large mosaic proteins are conspicuously absent in plants and fungi but
appear to be widespread among bacteria. Thus, cellulases and other
glycohydrolases from diverse bacteria have multidomain structures
containing in addition to their catalytic domains several noncatalytic
domains involved in substrate binding or specific protein interactions
(46). A particularly interesting example is the cellulosome
of Clostridium thermocellum, a cellulolytic multienzyme
complex located at the cell surface and consisting of numerous
catalytic components, including β-1,4-endoglucanases, cellobiohydrolases, and hemicellulases attached to the cellulosome integrating protein (scaffoldin) CipA (3, 4). This
attachment is mediated by the conserved dockerin domain of the
catalytic subunits and the iterated cohesin domains of CipA
(43). Targeting of the cellulosome to its cellulose
substrate is accomplished primarily by the cellulose-binding domain
(CBD) of CipA.
In this paper, we report the structure of the C. thermocellum F7 cellobiohydrolase gene cbhA and the
encoded cellulase, CbhA. This enzyme, formerly designated CBH3, has
been characterized as a cellobiohydrolase by its ability to hydrolyze
crystalline cellulose, yielding cellobiose as the only degradation
product (36, 44, 49). It will be shown that CbhA has a
highly complex multidomain structure containing, in addition to an
Ig-like domain and a catalytic domain homologous to cellulase family
E1, two distinct CBDs, a duplicated Fn3-like module, and a dockerin
domain. Evidence identifying CbhA as cellulosomal component S3 is
presented.
MATERIALS AND METHODS
Bacterial strains and growth conditions.
Escherichia
coli TG1 harboring recombinant plasmid pCU303 or pCU304
(49) was aerated at 37°C in Luria broth supplemented with
ampicillin (0.1 mg/ml). C. thermocellum F7 (obtained from the Institute of Microbial Biochemistry and Physiology, RAS, Puschino, Moscow Region, Russia) was grown under strict anaerobiosis at 60°C in
GS-2 medium (21).
Sequence analysis.
The DNA sequence was determined from
supercoiled double-stranded plasmid DNA for both strands by using the
Sequenase kit (Pharmacia) for extension of 5' biotinylated primers. DNA
fragments were detected with a GATC 1500 Direct-Blotting-Electrophoresis apparatus (GATC, Konstanz, Germany)
using streptavidin-conjugated alkaline phosphatase and nitroblue
tetrazolium-5-bromo-4-chloro-3-indolylphosphate (Serva) as the
chromogenic substrate. Sequence data were analyzed with the DNASIS
software package (Hitachi Software Engineering). Multiple sequence
alignments were carried out by the CLUSTAL procedure (19).
Hydrophobic cluster analysis (HCA) was performed by the method of
Gaboriaud et al., employing a simplified two-dimensional sequence
representation (18). To define hydrophobic clusters, F, I,
L, M, V, W, and Y were considered hydrophobic amino acids; alanine was
considered hydrophobic only within a hydrophobic cluster. To evaluate
the correspondence between two HCA patterns, a matching score,
(2 × CR × 100)/(RC1 + RC2), was calculated, where RC1 and RC2
are the numbers of aligned hydrophobic residues in sequences 1 and 2, respectively, and CR is the total number of matching residues.
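The matching score defined above can be sketched in a few lines of Python. This is an illustration only, not the authors' software: the function names, the treatment of gaps, and the two toy aligned sequences are assumptions, and the special rule for alanine inside hydrophobic clusters is not implemented.

```python
# Hydrophobic residues as defined in the text (alanine rule omitted in this sketch).
HYDROPHOBIC = set("FILMVWY")


def hca_counts(aligned1: str, aligned2: str) -> tuple[int, int, int]:
    """Return (CR, RC1, RC2) for two aligned sequences of equal length.

    Assumption: RC1 and RC2 count hydrophobic residues in each aligned
    sequence, and CR counts columns where both residues are hydrophobic.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have the same length")
    cr = rc1 = rc2 = 0
    for a, b in zip(aligned1.upper(), aligned2.upper()):
        h1 = a in HYDROPHOBIC
        h2 = b in HYDROPHOBIC
        rc1 += h1
        rc2 += h2
        cr += h1 and h2
    return cr, rc1, rc2


def hca_matching_score(cr: int, rc1: int, rc2: int) -> float:
    """Matching score (2 x CR x 100)/(RC1 + RC2), in percent."""
    if rc1 + rc2 == 0:
        raise ValueError("no hydrophobic residues in either sequence")
    return 2 * cr * 100 / (rc1 + rc2)


# Toy alignment (hypothetical, not taken from CbhA); '-' marks a gap.
cr, rc1, rc2 = hca_counts("VLKDF-YG", "ILRDSAFG")
print(cr, rc1, rc2, round(hca_matching_score(cr, rc1, rc2), 1))
```

Because the score divides twice the matches by the summed hydrophobic counts, it reaches 100% only when every hydrophobic position in one sequence is matched in the other.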
Purification of truncated CbhA protein.
A 5-liter culture of
E. coli TG1(pCU304) was harvested by centrifugation, washed
with 50 mM phosphate-citrate (PC) buffer (pH 6.3), suspended in 200 ml
of buffer containing 2 mM phenylmethylsulfonyl fluoride, and sonicated
by using an ultrasonic disintegrator (MSE). Cell extracts were heated
for 30 min at 60°C and centrifuged (10,000 × g, 20 min). The cleared crude extract was precipitated with ammonium sulfate
(60% saturation). The precipitate was collected by centrifugation and
dissolved in 50 ml of PC buffer.
Column chromatography was performed at room temperature with a
fast-performance liquid chromatography system (Pharmacia). Aliquots
(1.5 ml) were loaded on a 20- by 900-mm Toyopearl HW-60 column
(Toyo-Soda, Shinanyo, Japan) equilibrated with PC buffer and eluted
with the same buffer at a flow rate of 0.7 ml/min. Pooled fractions
with cellobiohydrolase activity were applied to a MonoQ HR 10/10
column. Elution was performed with a linear NaCl gradient (0.0 to 0.4 M) in PC buffer. Fractions containing CbhA, which eluted at 0.3 M NaCl,
were dialyzed, concentrated, and purified to electrophoretic
homogeneity by gel filtration on a Superose-12 HR column (16 by 500 mm).
Enzyme assay.
Cellobiohydrolase activity was assayed at
60°C for 10 min in PC buffer (pH 6.0) by using
p-nitrophenyl-β-D-cellobioside (1 mM) as the
substrate. Reactions were terminated by the addition of 1 M
Na2CO3. One enzyme unit corresponds to the
release of 1 µmol of p-nitrophenol per min.
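The unit definition above translates directly into a rate calculation. The sketch below is illustrative; the function name and the example amount of p-nitrophenol are hypothetical and do not come from the measurements reported here.

```python
def enzyme_units(pnp_released_umol: float, minutes: float) -> float:
    """Units of activity: µmol of p-nitrophenol released per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return pnp_released_umol / minutes


# Hypothetical example: 0.5 µmol released during the 10-min assay.
print(enzyme_units(0.5, 10.0))
```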
Preparation of cellulosomes.
A 0.5-liter culture of C. thermocellum F7 was grown for 36 h in GS-2 medium containing
filter paper as a sole carbon source. Cells were harvested by
centrifugation, washed six times with 250 ml of deionized water, and
resuspended in 30 ml of 100 mM acetate buffer (pH 5.7) containing 10 mM
CaCl2, 2 mM EDTA, and 5 mM dithiothreitol. The suspension
was sonicated for 3 min in an MSE ultrasonic disintegrator and dialyzed
at 50°C against acetate buffer to completely hydrolyze the remaining
cellulose fibers (37). After centrifugation (30,000 × g, 30 min), the supernatant was concentrated by
ultrafiltration (XM300 membrane; Amicon) to 1 ml and applied to a
Superose 6 HR 10/30 column (Pharmacia) equilibrated with 50 mM Tris-HCl
(pH 7.5). The purified cellulosomes eluted near the void volume of the
column.
Immunological methods.
Polyclonal antibodies were raised in
white rabbits by injection of 0.25 mg of recombinant CbhA protein in
Freund's complete adjuvant (Amersham). Booster injections were given
after 7 days, and bleeding was performed after 14 days. The serum was
purified by using a serum IgG column and checked for specificity. For
Western blot analysis, sodium dodecyl sulfate (SDS)-12%
polyacrylamide gel electrophoresis (PAGE) slabs of purified
cellulosomes were blotted onto a nitrocellulose membrane. The
replicates were incubated with anti-CbhA rabbit serum and subjected to
immunostaining using donkey anti-rabbit serum conjugated to horseradish
peroxidase (Amersham) and 4-chloro-1-naphthol as a chromogenic
substrate.
Protein cleavage, isolation of peptides, and sequencing of peptides and N termini.
Cellulosomal proteins (100 µg) were
separated by SDS-10% PAGE and stained with Coomassie blue. The band
corresponding to subunit S3 was cut out and incubated with 2 µg of
endoproteinase LysC (Boehringer) in 200 µl of 0.1 M Tris-HCl (pH 8.5)
for 6 h at 37°C. The peptide mixture was separated by
reversed-phase high-performance liquid chromatography on a Supersphere
60 RP select B column (Merck) at a flow rate of 0.3 ml/min. Solvent A
was 0.1% trifluoroacetic acid, and solvent B was 0.1% trifluoroacetic
acid in acetonitrile. The gradient of 0 to 70% solvent B was run in 70 min. Selected peptide-containing fractions were subjected to automated
sequencing. N-terminal amino acid sequences were determined by Edman
degradation using a Procise 492 protein sequencer (Applied Biosystems).
The phenylthiohydantoin derivatives were identified by reversed-phase high-performance liquid chromatography.
Nucleotide sequence accession number.
The nucleotide and
amino acid sequences reported in this study have been submitted to
GenBank under accession no. X80993.
RESULTS
Nucleotide sequence of the cbhA gene.
The
recombinant plasmid pCU303 carries a 10.7-kb insert of C. thermocellum including the cellobiohydrolase gene cbhA.
EcoRI digestion of pCU303 resulted in the deletion of a 7.3-kb DNA
segment, yielding plasmid pCU304 (49). Sequencing the insert
of pCU304 revealed that EcoRI cleavage had removed the
5'-end portion of the cbhA gene, leading to the production
of a truncated enzyme species still exhibiting cellobiohydrolase
activity. Therefore, the cbhA sequence was completed by
sequencing the corresponding region of pCU303 by using specific
oligonucleotide primers.
The sequenced region (4,183 bp) contained only one long open reading
frame (ORF) of 3,690 nucleotides encoding a protein of 1,230 amino
acids (Fig. 1). The putative initiation codon ATG was preceded at a
spacing of 6 bp by a potential ribosome-binding site with a calculated
free energy of Shine-Dalgarno base pairing of -66.5 kJ/mol. The ochre
stop codon at position 4129 is followed by another in-frame ochre stop
codon at position 4153. As observed previously for other
C. thermocellum genes (1), the coding sequence and its flanking regions
differed markedly in their G+C contents (43.0 and 29.7%, respectively).
A palindromic sequence with a free energy for RNA hairpin formation of
-81.2 kJ/mol is located immediately downstream of the ORF. This dyad
symmetry element, which is followed by a run of 5 T's, might function
as a factor-independent transcription terminator. Sequence inspection
did not reveal any consensus promoter sequence recognized by bacterial
RNA polymerases.
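The G+C comparison between the coding sequence and its flanking regions amounts to counting G and C among the unambiguous bases of each region. A minimal Python sketch follows; the function name and the toy sequences are illustrative assumptions and are not taken from the cbhA sequence.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the A/C/G/T bases of a DNA sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy fragments (hypothetical): a coding-like and an AT-rich flanking-like stretch.
coding_like = "ATGGCAGGTCCGTTAGCA"
flanking_like = "TTTTATAAATTTAGAATT"
print(round(gc_content(coding_like), 1), round(gc_content(flanking_like), 1))
```

Applied to the real ORF and to the regions upstream and downstream of it, the same calculation would be expected to reproduce the two percentages quoted in the text.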

FIG. 1.
Nucleotide and deduced amino acid sequences of the
cbhA gene. The potential ribosome-binding site (SD) is in
boldface type and underlined. A palindrome is indicated by arrows
facing each other. The putative leader sequence is indicated by italic
type. The segments encoding the different regions of CbhA are indicated
by boxes of different patterns for CBD family IV, the Ig-like domain,
the catalytic domain, the Fn3-like domain, CBD family III, and the
dockerin domain. The underlined amino
acids were determined for cellulosomal protein S3 by liquid-phase
sequencing.
Multidomain structure of CbhA.
Analysis of the amino acid
sequence of CbhA derived from the nucleotide sequence revealed a
multidomain structure of unexpected complexity (Fig. 1). Most
structural elements could be readily identified by sequence comparison.
Thus, the N-terminal sequence exhibits the typical features of a
bacterial signal peptide required for protein secretion (50)
with a predicted cleavage site between position 27 (Ala) and position
28 (Leu). Removal of the signal peptide yields a mature protein of
1,203 amino acids with a molecular weight of 135,139.
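The sizes quoted above can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative, and the mean residue mass is a derived value, not one reported by the authors.

```python
orf_nucleotides = 3690          # length of the open reading frame
signal_peptide_residues = 27    # predicted cleavage after position 27 (Ala)
mature_molecular_weight = 135139

# Three nucleotides per codon give the length of the precursor protein.
precursor_residues = orf_nucleotides // 3
mature_residues = precursor_residues - signal_peptide_residues
mean_residue_mass = mature_molecular_weight / mature_residues

print(precursor_residues, mature_residues, round(mean_residue_mass, 1))
```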
The central region of CbhA contains the catalytic domain, which is
homologous to cellulase subfamily E1 (46). It exhibits 38 to 40%
sequence identity with the catalytic domains of a
carboxymethylcellulase from Pseudomonas fluorescens (14) and a group of
cellulases from gram-positive bacteria with high G+C contents,
including endoglucanase E1 from Thermomonospora fusca (26),
endoglucanase CenC from Cellulomonas fimi (9), and endoglucanase Cel1
from Streptomyces reticuli (42). On the other hand, only 20 to 22%
sequence identity was observed between the catalytic domain of CbhA and
the C. thermocellum cellulases CelD (24) and CelJ (1), two other
members of cellulase subfamily E1. As observed for all enzymes of this
subfamily, the catalytic domain of CbhA is preceded by an Ig-like
β-barrel domain of unknown function (27).
The catalytic core region of CbhA is flanked by two distinct CBDs. The
N-terminal domain is homologous to family IV substrate-binding domains
of bacterial cellulases and endo-1,3-β-glucanases (Fig. 2), whereas the
C-terminal domain is a member of CBD family III (Fig. 3). Both domains
consist of two antiparallel β-sheets with the topology of a jelly roll
β-sandwich (23, 48). Substrate binding is mediated by a strip of highly
conserved aromatic residues flanked by polar hydrogen-bonding groups.
The family III CBD of the C. thermocellum scaffoldin CipA also contains
a Ca2+ binding site (48), which seems to be present in all members of
this family.

FIG. 2.
Alignment of amino acid sequences of family IV CBDs of
bacterial cellulases and endo-1,3-β-glucanases. Abbreviations and
accession numbers: Cth-LicA, C. thermocellum LicA, X89732;
Tne-LamA, Thermotoga neapolitana LamA (54),
Z47974; Cth-CbhA, C. thermocellum CbhA; Cce-CelE,
Clostridium cellulolyticum CelE (2), Q46002;
Cfi-CenC, Cellulomonas fimi CenC (9), P14090;
Sre-Cel1, S. reticuli Cel1 (42), Q05156; Tfu-E1,
Thermomonospora fusca E1 (26), Q08166. Shaded
boxes highlight positions where residues are conserved in five or more
family members, including those of CbhA. The conserved aromatic
residues are indicated by asterisks. All sequences are numbered from
Met-1.
FIG. 3.
Alignment of amino acid sequences of selected CBDs from
family III. Abbreviations and accession numbers: Cth-CbhA, C. thermocellum CbhA; Cth-CipA, C. thermocellum CipA
(12), X67506; Csa-CelB, Caldicellulosiruptor
saccharolyticus CelB (41), X13602; Cst-CelZ,
Clostridium stercorarium CelZ (20), X55299;
Cth-CelI, C. thermocellum CelI (17), L04735;
Bla-CelA, Bacillus lautus CelA (15), M76588;
Eca-CelV, Erwinia carotovora CelV (33), X79241.
Shaded boxes highlight positions where residues are conserved in four
or more family members, including those of CbhA. The conserved aromatic
residues and the residues that are implicated in Ca2+
binding are indicated by asterisks and solid triangles, respectively.
All sequences are numbered from Met-1.
Identification of a novel Fn3-like domain.
Inspection of the
protein sequence joining the CbhA catalytic domain and the family III CBD
by HCA (13, 29) revealed the presence of a repeated
domain (Fig. 4). Although the aligned
sequences exhibit only 26% identity, their HCA matching score is 80%,
which is considered strong evidence for sequence homology
(13). The duplicated domain showed no obvious homology to
other noncatalytic domains but appeared to be distantly related to the
Fn3-like domain of T. fusca endoglucanase E1 and T. fusca exoglucanase E4. Although the similarity is barely
detectable on the amino acid sequence level, high-accuracy
secondary-structure prediction (11) suggests a β-sheet
topology strikingly similar to that of Fn3 modules (Fig. 5). The duplicated CbhA domain also
resembles Fn3-like domains in amino acid composition, exhibiting an
increased content of valine and hydroxylated aliphatic amino acids
(data not shown).

FIG. 4.
HCA plots of CbhA amino acid positions 825 to 912 (A)
and 914 to 1000 (B). Hydrophobic amino acids are shown as gray circles
with conserved positions highlighted in dark gray. Proline residues are
shown as black circles, and other helix-breaking amino acids (D, G, S,
and N) found predominantly in loop regions are shown as white
circles.
FIG. 5.
Secondary-structure prediction of Fn3-like modules.
Abbreviations and accession numbers: Tfu-E4, Thermomonospora
fusca exoglucanase E4 (26), L20093; Hum-Fib, human
fibronectin (34), P02751. Other abbreviations are as
described in the legend to Fig. 2. Secondary-structure states of amino
acids were predicted by the PREDATOR program (11) and are
represented by an "E" (extended or sheet) and a dash (coil). The
seven antiparallel β-strands of the 10th Fn3 module of human
fibronectin are designated by the letters A to F (34) at the
bottom of the figure. All sequences are numbered from Met-1.
Cellulosomal localization.
The C-terminal segment of CbhA is
made up of the highly conserved duplicated sequence of 24 amino acids
constituting the cellulosomal dockerin module. This domain is separated
from the family III CBD by an acidic 14-amino-acid linker sequence
consisting of repeats of the tripeptide Pro-Glu-Glu (Fig. 1). The
presence of a dockerin domain strongly suggests that CbhA is a
cellulosome constituent. To confirm this conjecture, polyclonal
antibodies were raised against the truncated CbhA protein expressed by
pCU304. It should be pointed out that the cloned insert terminates at
the EcoRI site at nucleotide 3688 and thus lacks the
C-terminal portion of CbhA including the dockerin domain. The truncated
protein is further processed upon expression in E. coli,
yielding an enzymatically active protein of 80 kDa, which presumably
consists only of the catalytic domain and the flanking Ig- and Fn3-like
domains. Western blot analysis indicated that two cellulosomal
proteins, S3 and S5, with apparent molecular masses of 150 and 98 kDa,
respectively, strongly reacted with anti-CbhA antibodies (Fig.
6). The comparison of molecular masses
indicates that CbhA might correspond to subunit S3. The identity of
CbhA and S3 was established by amino acid sequence analysis. Due to
blockage of the N terminus, partial sequences were determined upon
cleavage of S3 with endoprotease LysC. The two peptide sequences
obtained (see Fig. 1) were fully consistent with the deduced amino acid
sequence of CbhA.

FIG. 6.
Detection of CbhA in the cellulosome of C. thermocellum. (Left panel) Western blot analysis of cellulosomal
proteins detected with a polyclonal antibody raised against truncated
CbhA; (right panel) SDS-PAGE of cellulosomal proteins stained with
Coomassie brilliant blue. Cellulosomal subunits S1 to S14 are indicated
with corresponding molecular masses.
DISCUSSION
Although numerous cellulolytic and hemicellulolytic C. thermocellum enzymes are considered cellulosome constituents due
to the presence of a dockerin domain, only a few have been correlated with cellulosomal subunits (1, 8, 16, 38, 53). Western blot
analysis and amino acid sequence determination clearly demonstrate that
CbhA is identical to cellulosomal protein S3. It should be noted that
the molecular mass of S3 (150 kDa) determined by SDS-PAGE is
considerably larger than the mass of CbhA (135 kDa) deduced from the
DNA sequence. This difference might be due to an atypical electrophoretic mobility of CbhA possibly caused by the highly acidic
linker sequence (positions 1150 to 1162). It was observed previously
that the presence of linker regions rich in glutamic acid residues can
retard migration of multidomain proteins in SDS-PAGE (31).
The immunological data suggest that S5 is either a structurally related
protein or a proteolytic degradation product of CbhA. Formation of S5
by proteolytic cleavage of CbhA is consistent with the N-terminal
sequence of S5 from C. thermocellum JW20 (8). The
reported sequence LEDKS(S)KLPDYKNDL(L)YE is nearly
identical to the N terminus of mature CbhA predicted from the sequence
data (Fig. 1). Minor sequence variations could reflect differences between C. thermocellum JW20 and F7. The size of S5 (98 kDa)
suggests that proteolytic cleavage might have occurred between the two
Fn3-like modules of CbhA. Truncation of the C-terminal
dockerin domain during cellulosome dissociation has recently been
reported for subunit S8, which corresponds to cellobiohydrolase CelS
(8).
The identification of CbhA and CelS as cellulosomal constituents S3 and
S8, respectively, implies that the cellulosome contains at least two
exoglucanases and refutes the early concept that the cellulosome
consists entirely of endoglucanase activities (35). Both
exoglucanases have been characterized as cellobiohydrolases (28,
36, 44, 49) but belong to different cellulase families. CbhA is a
member of cellulase family E1, whereas CelS belongs to family L
(46). The two enzymes also differ strikingly in their domain
structures. CelS is less complex and consists of a catalytic domain and
a C-terminal dockerin domain (51). Due to its lack of CBDs,
CelS requires the presence of CipA for the efficient hydrolysis of
crystalline cellulose. It has been proposed that both proteins interact
synergistically in an enzyme (CelS)-anchor (CipA) manner (32,
52).
The multidomain structure of CbhA was unexpected, considering that the
cellulosome is mainly an assembly of catalytic subunits, which are
organized for concerted action and targeted to the insoluble substrate
by the CipA protein (4). In particular, the presence of both
an N-terminal and a C-terminal CBD is apparently redundant. However, it
should be kept in mind that family IV and family III CBDs differ
strikingly in their substrate specificity. Whereas family III domains
bind specifically to crystalline cellulose, family IV domains bind with
approximately equal affinities to amorphous cellulose,
cellooligopentaose, and mixed-linkage β-glucans (22,
47). Conceivably, this binding site could participate directly in
cellulose degradation by keeping the amorphous region in a
noncrystalline state suitable for enzymatic hydrolysis. On the other
hand, the C-terminal family III CBD might assist CipA in attaching the
cellulosome to crystalline cellulose fibers.
The role of the other noncatalytic domains of CbhA is less obvious. It
should be noted that the Ig-like β-barrel domain has so far been
found only in members of cellulase family E1, where it is always
positioned at the N terminus of the catalytic domain (see Fig. 7). It
might therefore be specifically involved in the folding and/or
stabilization of the catalytic α6/α6-barrel domain of this cellulase subfamily. In contrast, Fn3-like domains are found in various unrelated prokaryotic depolymerases in widely different arrangements (30). It is therefore likely that
these domains have a similar function in prokaryotic and in eukaryotic exoproteins, namely, adhesion to cell surface receptors. In the case of
CbhA, this original function became redundant upon integration of the
enzyme into the cellulosomal complex. On the other hand, duplication of
the module might be required for correct positioning of the C-terminal
CBD with respect to the catalytic domain. Structure analysis has shown
that such module pairs do not simply function as flexible spacer
elements but adopt defined relative orientations stabilized by specific
intermodule interactions (7, 39). This change of function
could explain the sequence divergence from other prokaryotic Fn3-like
modules.
Comparison of the domain structures of various other cellulases of
subfamily E1 indicates striking similarities between CbhA and a group
of enzymes from gram-positive bacteria with high G+C content (Fig.
7). In particular, it is obvious that the
endoglucanase E1 of T. fusca has a similar functional design
consisting of an N-terminal and central catalytic region involved in
cellulose hydrolysis and a C-terminal portion involved in substrate and cell surface adherence. Average linkage cluster analysis of the N-terminal family IV CBD and the catalytic domains suggests coevolution of these two domains (Fig. 8).
Apparently, this domain array arose by a rare recombination event and
spread by horizontal transfer among gram-positive cellulolytic
bacteria. Contrary to the proposed rearrangements of eukaryotic
multidomain proteins due to exon shuffling, such domain arrays appear
to be remarkably stable in bacteria, reflecting a fundamental
difference in gene structure.
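Average linkage clustering of the kind used for Fig. 8 repeatedly merges the two clusters whose members have the highest mean pairwise similarity. The Python sketch below illustrates that procedure under stated assumptions: the function name, the input format, and the similarity values for the placeholder labels A to D are hypothetical, and the amino acid grouping step used to obtain the scores is not reproduced.

```python
from itertools import combinations


def average_linkage(names, similarity):
    """Cluster items by unweighted average linkage on similarity scores.

    similarity maps frozenset({a, b}) to the pairwise score of a and b.
    Returns the tree as nested tuples plus the list of (merged tree, score).
    """
    clusters = [(name, [name]) for name in names]  # (subtree, member labels)
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            members_i, members_j = clusters[i][1], clusters[j][1]
            total = sum(similarity[frozenset((a, b))]
                        for a in members_i for b in members_j)
            mean = total / (len(members_i) * len(members_j))
            if best is None or mean > best[0]:
                best = (mean, i, j)
        mean, i, j = best
        tree = (clusters[i][0], clusters[j][0])
        members = clusters[i][1] + clusters[j][1]
        merges.append((tree, mean))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((tree, members))
    return clusters[0][0], merges


# Hypothetical similarity scores for four placeholder sequences.
scores = {
    frozenset("AB"): 90, frozenset("CD"): 80, frozenset("AC"): 40,
    frozenset("AD"): 30, frozenset("BC"): 50, frozenset("BD"): 20,
}
tree, merges = average_linkage(["A", "B", "C", "D"], scores)
print(tree)
for subtree, score in merges:
    print(subtree, score)
```

Running the same routine once on CBD scores and once on catalytic domain scores, and comparing the resulting trees, is one way to look for the congruent branching that the coevolution argument relies on.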

FIG. 7.
Comparison of the domain structure of cellulases of
subfamily E1. Domains and regions showing significant similarity are
indicated by the same pattern. Abbreviations and accession numbers:
Cth-CelJ, C. thermocellum CelJ (1), D83704;
Cth-CelD, C. thermocellum CelD (24), X04584;
Pfl-EglA, P. fluorescens EglA (14), X12570;
Fsu-EgB, Fibrobacter succinogenes EgB (6),
L14436; Bfi-CelD, Butyrivibrio fibrisolvens CelD
(5), X55732; aa, amino acids. Other abbreviations for the
enzymes are described in the legend to Fig. 2.
FIG. 8.
Average linkage cluster analysis. Similar amino acids
were grouped by the classification of Risler et al. (40).
The dendrogram was derived from pairwise similarity scores by the
unweighted pair group method with arithmetic mean (UPGMA)
(45). Abbreviations for enzymes are described in the
legends to Fig. 2 and 7.
ACKNOWLEDGMENTS
This work was supported in part by a grant from the Deutsche
Forschungsgemeinschaft (SFB 145), by a NATO Collaborative Research grant (HTECH. CRG 930993), by a grant from the Volkswagenstiftung, and
by a grant from the Russian Foundation of Basic Research.
FOOTNOTES
*Corresponding author. Mailing address: Institute for
Microbiology, Technical University Munich, Arcisstrasse 21, D-80290
Munich, Germany. Phone: (089) 2892-2372. Fax: (089) 2892-2360. E-mail: zverlov@biol.chemie.tu-muenchen.de.
REFERENCES
1. Ahsan, M. M., T. Kimura, S. Karita, K. Sakka, and K. Ohmiya. 1996. Cloning, DNA sequencing, and expression of the gene encoding Clostridium thermocellum cellulase CelJ, the largest catalytic component of the cellulosome. J. Bacteriol. 178:5732-5740.
2. Bagnara-Tardif, C., C. Gaudin, A. Belaich, P. Hoest, T. Citard, and J. P. Belaich. 1992. Sequence analysis of a gene cluster encoding cellulases from Clostridium cellulolyticum. Gene 119:17-28.
3. Bayer, E. A., E. Morag, and R. Lamed. 1994. The cellulosome - a treasure-trove for biotechnology. Trends Biotechnol. 12:379-386.
4. Béguin, P., and M. Lemaire. 1996. The cellulosome: an exocellular, multiprotein complex specialized in cellulose degradation. Crit. Rev. Biochem. Mol. Biol. 31:201-236.
5. Berger, E., W. A. Jones, D. T. Jones, and D. R. Woods. 1990. Sequencing and expression of a cellodextrinase (ced1) gene from Butyrivibrio fibrisolvens H17c cloned in Escherichia coli. Mol. Gen. Genet. 223:310-318.
6. Broussolle, V., E. Forano, G. Gaudet, and Y. Ribot. 1994. Gene sequence and analysis of protein domains of EGB, a novel family E endoglucanase from Fibrobacter succinogenes S58. FEMS Microbiol. Lett. 124:439-447.
7. Campbell, I. D., and C. Spitzfaden. 1994. Building proteins with fibronectin type III modules. Structure 2:333-337.
8. Choi, S. K., and L. G. Ljungdahl. 1996. Dissociation of the cellulosome of Clostridium thermocellum in the presence of ethylenediaminetetraacetic acid occurs with the formation of truncated polypeptides. Biochemistry 35:4897-4905.
9. Coutinho, J. B., B. Moser, D. G. Kilburn, R. A. J. Warren, and R. C. Miller. 1991. Nucleotide sequence of the endoglucanase C gene (cenC) of Cellulomonas fimi, its high-level expression in Escherichia coli, and characterization of its products. Mol. Microbiol. 5:1221-1233.
10. Doolittle, R. F. 1995. The multiplicity of domains in proteins. Annu. Rev. Biochem. 64:287-314.
11. Frishman, D., and P. Argos. 1997. Seventy-five percent accuracy in protein secondary structure prediction. Proteins Struct. Funct. Genet. 27:329-335.
12. Fujino, T., P. Béguin, and J. P. Aubert. 1993. Organization of a Clostridium thermocellum gene cluster encoding the cellulosomal scaffolding protein CipA and a protein possibly involved in attachment of the cellulosome to the cell surface. J. Bacteriol. 175:1891-1899.
13. Gaboriaud, C., V. Bissery, T. Benchetrit, and J. P. Mornon. 1987. Hydrophobic cluster analysis: an efficient new way to compare and analyse amino acid sequences. FEBS Lett. 224:149-155.
14. Hall, J., and H. J. Gilbert. 1988. The nucleotide sequence of a carboxymethylcellulase gene from Pseudomonas fluorescens subsp. cellulosa. Mol. Gen. Genet. 213:112-117.
15. Hansen, C. K., B. Diderichsen, and P. L. Jorgensen. 1992. celA from Bacillus lautus PL236 encodes a novel cellulose-binding endo-β-1,4-glucanase. J. Bacteriol. 174:3522-3531.
16. Hayashi, H., K.-I. Takagi, M. Fukumura, T. Kimura, S. Karita, K. Sakka, and K. Ohmiya. 1997. Sequence of xynC and properties of XynC, a major component of the Clostridium thermocellum cellulosome. J. Bacteriol. 179:4246-4253.
17. Hazlewood, G. P., K. Davidson, J. I. Laurie, N. S. Huskisson, and H. J. Gilbert. 1993. Gene sequence and properties of CelI, a family E endoglucanase from Clostridium thermocellum. J. Gen. Microbiol. 139:307-316.
18. Henrissat, B., Y. Popineau, and Y. Kader. 1988. Hydrophobic cluster analysis of plant protein sequences. A domain homology between storage and lipid transfer proteins. Biochem. J. 255:901-905.
19. Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266:388-402.
20. Jauris, S., K. P. Ruecknagel, W. H. Schwarz, P. Kratzsch, K. Bronnenmeier, and W. L. Staudenbauer. 1990. Sequence analysis of the Clostridium stercorarium celZ gene encoding a thermoactive cellulase (Avicelase I): identification of catalytic and cellulose-binding domains. Mol. Gen. Genet. 223:258-267.
| 21.
|
Johnson, E. A.,
A. Madia, and A. L. Demain.
1981.
Chemically defined minimal medium for growth of the anaerobic cellulolytic thermophile Clostridium thermocellum.
Appl. Environ. Microbiol.
41:1060-1062.
|
| 22.
|
Johnson, P. E.,
P. Tomme,
M. D. Joshi, and L. P. McIntosh.
1996.
Interaction of soluble cellooligosaccharides with the N-terminal cellulose-binding domain of Cellulomonas fimi CenC. 2. NMR and ultraviolet absorption spectroscopy.
Biochemistry
35:13895-13906.
|
| 23.
|
Johnson, P. E.,
M. D. Joshi,
P. Tomme,
D. G. Kilburn, and L. P. McIntosh.
1996.
Structure of the N-terminal cellulose-binding domain of Cellulomonas fimi CenC determined by nuclear magnetic resonance spectroscopy.
Biochemistry
35:14381-14394.
|
| 24.
|
Joliff, G.,
P. Béguin, and J. P. Aubert.
1986.
Nucleotide sequence of the cellulase gene celD encoding endoglucanase D of Clostridium thermocellum.
Nucleic Acids Res.
14:8605-8613.
|
| 25.
|
Jones, E. Y.
1993.
The immunoglobulin superfamily.
Curr. Opin. Struct. Biol.
3:846-852.
|
| 26.
|
Jung, E. D.,
G. Lao,
D. Irwin,
B. K. Barr,
A. Benjamin, and D. B. Wilson.
1993.
DNA sequences and expression in Streptomyces lividans of an exoglucanase gene and an endoglucanase gene from Thermomonospora fusca.
Appl. Environ. Microbiol.
59:3032-3043.
|
| 27.
|
Juy, M.,
A. G. Amit,
P. M. Alzari,
R. J. Poljak,
M. Claeyssens,
P. Béguin, and J. P. Aubert.
1992.
Three-dimensional structure of a thermostable bacterial cellulase.
Nature
357:89-91.
|
| 28.
|
Kruus, K.,
W. K. Wang, and J. H. D. Wu.
1995.
Exoglucanase activities of the recombinant Clostridium thermocellum CelS, a major cellulosome component.
J. Bacteriol.
177:1641-1644.
|
| 29.
|
Lemesle-Varloot, L.,
B. Henrissat,
C. Gaboriaud,
V. Bissery,
A. Morgat, and J. P. Mornon.
1990.
Hydrophobic cluster analysis: procedures to derive structural and functional information from 2-D-representation of protein sequences.
Biochimie
72:555-574.
|
| 30.
|
Little, E.,
P. Bork, and R. F. Doolittle.
1994.
Tracing the spread of fibronectin type III domains in bacterial glycohydrolases.
J. Mol. Evol.
39:631-643.
|
| 31.
|
Lück, A.,
J. D'Haese, and H. Hinssen.
1995.
A gelsolin-related protein from lobster muscle: cloning, sequence analysis and expression.
Biochem. J.
305:767-775.
|
| 32.
|
Lytle, B.,
C. Myers,
K. Kruus, and J. H. D. Wu.
1996.
Interactions of the CelS binding ligand with various receptor domains of the Clostridium thermocellum cellulosomal scaffolding protein, CipA.
J. Bacteriol.
178:1200-1203.
|
| 33.
|
Mae, A.,
R. Heikinheimo, and E. T. Palva.
1995.
Structure and regulation of the Erwinia carotovora subspecies carotovora SCC3193 cellulase gene celV1 and the role of cellulase in phytopathogenicity.
Mol. Gen. Genet.
247:17-26.
|
| 34.
|
Main, A. L.,
T. S. Harvey,
M. Baron,
J. Boyd, and I. D. Campbell.
1992.
The three-dimensional structure of the tenth type III module of fibronectin: an insight into RGD-mediated interactions.
Cell
71:671-678.
|
| 35.
|
Mayer, F.,
M. P. Coughlan,
Y. Mori, and L. G. Ljungdahl.
1987.
Macromolecular organization of the cellulolytic enzyme complex of Clostridium thermocellum as revealed by electron microscopy.
Appl. Environ. Microbiol.
53:2785-2792.
|
| 36.
|
Mel'nik, M. S.,
M. L. Rabinovich, and I. V. Voznyi.
1991.
Cellobiohydrolase from Clostridium thermocellum, synthesized by a recombinant E. coli strain.
Biokhimiya
56:1787-1797.
|
| 37.
|
Morag, E.,
E. A. Bayer, and R. Lamed.
1992.
Affinity digestion for the near-total recovery of purified cellulosome from Clostridium thermocellum.
Enzyme Microb. Technol.
14:289-292.
|
| 38.
|
Morag, E.,
E. A. Bayer,
G. P. Hazlewood,
H. J. Gilbert, and R. Lamed.
1993.
Cellulase SS (CelS) is synonymous with the major cellobiohydrolase (subunit S8) from the cellulosome of Clostridium thermocellum.
Appl. Biochem. Biotechnol.
43:147-151.
|
| 39.
|
Potts, J. R., and I. D. Campbell.
1996.
Structure and function of fibronectin modules.
Matrix Biol.
15:313-320.
|
| 40.
|
Risler, J. L.,
M. O. Delorme,
H. Delacroix, and A. Henaut.
1988.
Amino acid substitutions in structurally related proteins. A pattern recognition approach. Determination of a new and efficient scoring matrix.
J. Mol. Biol.
204:1019-1029.
|
| 41.
|
Saul, D. J.,
L. C. Williams,
R. A. Grayling,
L. W. Chamley,
D. R. Love, and P. L. Bergquist.
1990.
celB, a gene coding for a bifunctional cellulase from the extreme thermophile Caldocellum saccharolyticum.
Appl. Environ. Microbiol.
56:3117-3124.
|
| 42.
|
Schlochtermeier, A.,
S. Walter,
J. Schröder,
M. Moorman, and H. Schrempf.
1992.
The gene encoding the cellulase (Avicelase) Cel1 from Streptomyces reticuli and analysis of protein domains.
Mol. Microbiol.
6:3611-3621.
|
| 43.
|
Shimon, L. J. W.,
E. A. Bayer,
E. Morag,
R. Lamed,
S. Yaron,
Y. Shoham, and F. Frolow.
1997.
A cohesin domain from Clostridium thermocellum: the crystal structure provides new insights into cellulosome assembly.
Structure
5:381-390.
|
| 44.
|
Singh, R. N., and V. K. Akimenko.
1993.
Isolation of a cellobiohydrolase of Clostridium thermocellum capable of degrading natural crystalline substrates.
Biochem. Biophys. Res. Commun.
192:1123-1130.
|
| 45.
|
Sokal, R. R., and P. H. A. Sneath.
1963.
Principles of numerical taxonomy.
Freeman, San Francisco, Calif.
|
| 46.
|
Tomme, P.,
R. A. J. Warren, and N. R. Gilkes.
1995.
Cellulose hydrolysis by bacteria and fungi.
Adv. Microb. Physiol.
37:1-81.
|
| 47.
|
Tomme, P.,
L. Creagh,
D. Kilburn, and C. Haynes.
1996.
Interaction of polysaccharides with the N-terminal cellulose-binding domain of Cellulomonas fimi CenC. 1. Binding specificity and calorimetric analysis.
Biochemistry
35:13885-13894.
|
| 48.
|
Tormo, J.,
R. Lamed,
A. J. Chirino,
E. Morag,
E. A. Bayer,
Y. Shoham, and T. A. Steitz.
1996.
Crystal structure of a bacterial family-III cellulose-binding domain: a general mechanism for attachment to cellulose.
EMBO J.
15:5739-5751.
|
| 49.
|
Tuka, K.,
V. V. Zverlov,
B. K. Bumazkin,
G. A. Velikodvorskaya, and A. Y. Strongin.
1990.
Cloning and expression of Clostridium thermocellum genes coding for thermostable exoglucanases (cellobiohydrolases) in Escherichia coli cells.
Biochem. Biophys. Res. Commun.
169:1055-1060.
|
| 50.
|
von Heijne, G.
1985.
Signal sequences. The limits of variation.
J. Mol. Biol.
184:99-105.
|
| 51.
|
Wang, W. K.,
K. Kruus, and J. H. D. Wu.
1993.
Cloning and DNA sequence of the gene coding for Clostridium thermocellum cellulase SS (CelS), a major cellulosome component.
J. Bacteriol.
175:1293-1302.
|
| 52.
|
Wu, J. H. D.,
W. H. Orme-Johnson, and A. L. Demain.
1988.
Two components of an extracellular protein aggregate of Clostridium thermocellum together degrade crystalline cellulose.
Biochemistry
27:1703-1709.
|
| 53.
|
Zverlov, V. V.,
K. P. Fuchs,
W. H. Schwarz, and G. Velikodvorskaya.
1994.
Purification and cellulosomal localization of Clostridium thermocellum mixed linkage β-glucanase LicB (1,3-1,4-β-D-glucanase).
Biotechnol. Lett.
16:29-34.
|
| 54.
|
Zverlov, V. V.,
I. Y. Volkov,
T. V. Velikodvorskaya, and W. H. Schwarz.
1997.
Highly thermostable endo-1,3-β-glucanase (laminarinase) LamA from Thermotoga neapolitana: nucleotide sequence of the gene and characterization of the recombinant gene product.
Microbiology
143:1701-1708.
|