ABSTRACT
The gene encoding a hyperthermostable type II pullulanase produced by Thermococcus hydrothermalis (Th-Apu) has been isolated. Analysis of a total of 5.2 kb of genomic DNA has revealed the presence of three open reading frames, one of which (apuA) encodes the pullulanase. This enzyme is composed of 1,339 amino acid residues and exhibits a multidomain structure. In addition to a typical N-terminal signal peptide, Th-Apu possesses a catalytic domain, a domain bearing S-layer homology-like motifs, a Thr-rich region, and a potential C-terminal transmembrane domain. The presence of these noncatalytic domains suggests that Th-Apu may be anchored to the cell surface and be O glycosylated.
Pullulanases (EC 3.2.1.41) cleave the α-1,6 glucosidic bonds in pullulan (2) and can be classified according to the additional ability (type II) or inability (type I) to degrade the α-1,4 glucosidic bonds of other polysaccharides (1). In recent years, a considerable number of type II pullulanases (also termed amylopullulanases) have been isolated from a wide variety of microorganisms, particularly thermophilic ones, since scientific interest in this class of enzymes is motivated by industrial applications (11, 18, 23, 25, 26, 39). To date, four members of the archaeal order Thermococcales, Pyrococcus furiosus, Thermococcus litoralis (3), P. woesei (35), and T. celer (4), have been described as pullulanase producers. In each case, the pullulanase activity was described as type II and was localized in the culture medium, indicating that these enzymes are secreted. The enzymes from P. furiosus and T. litoralis have been described as glycosylated and may be present in multiple forms. In addition, the production of these enzymes appears to be inducible, with malto-oligosaccharides being the principal inducers. Among these enzymes, only those of P. furiosus and P. woesei have been studied at the genetic level. In both cases, a gene encoding a type II pullulanase has been isolated, cloned, and characterized, although the actual nucleotide sequence of the P. woesei type II pullulanase is not available for analysis.
Previously, we have characterized the amylolytic activities of T. hydrothermalis and have shown that this archaebacterium produces several amylolytic enzymes, including at least one pullulanase (Th-Apu) (13, 19). Purification and characterization of this enzyme have revealed that it is highly thermostable and is capable of hydrolyzing the α-1,6 glucosidic bonds of pullulan, producing maltotriose, and both the α-1,6 and α-1,4 glucosidic bonds of starch, producing oligosaccharides (degree of polymerization, as low as 4) (12). In this work, we report the isolation and characterization of the gene encoding Th-Apu, as well as two other partial open reading frames (ORFs) which encode mal-like operon elements.
Isolation and characterization of the Th-Apu gene and its surrounding DNA environment. In order to isolate the gene encoding Th-Apu, we first performed N-terminal microsequencing of the native protein purified from the medium of a T. hydrothermalis AL662 culture. This yielded a sequence of 11 amino acid residues (AEPKPLNVIIV) which was compared to the N-terminal sequence of the mature pullulanase from P. furiosus (Pf-Apu) (12). The high degree of similarity between these two small regions (9 of 11 amino acid residues are identical) indicated that the two proteins might share a high degree of overall homology. Therefore, using the primary sequence of Pf-Apu, three degenerate oligonucleotides were designed for use in standard PCRs using T. hydrothermalis AL662 genomic DNA as the template. In this way, specific sequences were amplified and then sequenced, thus providing the necessary sequence information to perform genome crawling in the upstream and downstream directions. Ultimately, a 5,200-bp sequence was generated which contains the entire Th-Apu gene (designated apuA) flanked by two partial ORFs. Translation of the partial ORFs generated two polypeptides of 118 and 261 amino acids, respectively. The first of these proteins is highly similar (81% identity) to the C-terminal part of the hypothetical MalG-like protein from P. furiosus (10) and indeed displays identity with several known MalG proteins, including that of Escherichia coli (41%) (9). Similarly, the second polypeptide exhibits strong identity with several MalK proteins, including those of E. coli (57% identity) (14), Salmonella typhimurium (56% identity), and Enterobacter aerogenes (57% identity) (8), leading us to the conclusion that this protein may be a MalK homologue. The apparently promoterless apuA gene, surrounded by these mal-like genes, is composed of 4,011 bp, 2,562 bp of which show some similarity (∼40%) to the sequence encoding Pf-Apu. However, the P. furiosus and T. hydrothermalis sequences diverge after this point. Whereas the Pf-Apu ORF ends at this position with a stop codon, the T. hydrothermalis ORF has an additional 1,449 bp before the stop codon (TGA) is found.
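The need for degenerate oligonucleotides in the strategy described above follows from the redundancy of the genetic code: most amino acids are specified by several codons, so a peptide can be encoded by many different nucleotide sequences. The following minimal Python sketch illustrates this point using the 11-residue N-terminal sequence reported here. It is an illustration only. The oligonucleotides actually used were designed from the primary sequence of Pf-Apu, which is not reproduced in this text, and the function names and the choice of a six-residue window are assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, written in the conventional TCAG order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


def least_degenerate_window(peptide, length):
    """Return (degeneracy, start index, subsequence) for the window of the
    given length that can be encoded by the fewest nucleotide sequences.
    Hypothetical helper: the window length is an arbitrary choice."""
    windows = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length])
        for i in range(len(peptide) - length + 1)
    ]
    return min(windows)


if __name__ == "__main__":
    n_terminal = "AEPKPLNVIIV"  # N-terminal sequence of mature Th-Apu
    print(degeneracy(n_terminal))
    print(least_degenerate_window(n_terminal, 6))
```

In practice, primer design also weighs codon usage, melting temperature, and the position of the degenerate bases, none of which is modeled in this sketch.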
Primary structure analysis of Th-Apu. Translation of apuA revealed a polypeptide sequence composed of 1,339 amino acids which are arranged into several well-defined domains (Fig. 1). Like Pf-Apu, Th-Apu possesses an N-terminal sequence (amino acids 2 to 27) which bears the characteristics of a signal peptide (33). The ensuing sequence (PUL), composed of 827 amino acids, presents a high degree of sequence identity (79%) with Pf-Apu and, as such, probably represents the catalytic domain. Beyond this point, three other domain types, absent in Pf-Apu, can be distinguished. The first of these consists of two almost identical sequence repeats (R1 and R2), each of which possesses a repeated motif which, when compared to the PROSITE database, shows similarity to the S-layer homology (SLH) signature. Alignment of these repeated motifs with the prealigned sequences of ProDom domain 1624 (7) confirmed this finding while indicating that there is no complete consensus between the Th-Apu motifs (Fig. 2). Indeed, the latter part of the SLH consensus motif, defined as ILLA TS R ASQ EDQ, where consensus amino acid residues are in boldface (24), is absent in the first SLH motif of both Th-Apu S-layer motif-bearing domains (SLD1 and SLD2), while being barely distinguishable in the second SLH motif of each of these domains. Although this sequence appears to be a well-conserved element of the SLH motif, its absence does not necessarily indicate that the Th-Apu-derived domains are not SLDs, since a large sequence diversity exists, even among the known eubacterial SLH motifs. Indeed, a similar, incomplete SLH motif in the SlpA protein of Clostridium thermocellum has recently been described (21). Furthermore, apart from this apparent divergence from the canonical sequence, the alignment revealed that the thermococcal domains possess most of the other pertinent features of a eubacterial SLD.
In contrast, comparison of the SLH motifs of Th-Apu with those of several proteins from methanogenic archaebacteria (6, 28, 29) failed to reveal anything more than superficial similarity.
FIG. 1. Multidomain structure of Th-Apu and comparison of this protein to Pf-Apu. R1 and R2 are two homologous repeats of 230 amino acid (aa) residues each. SLD1 and SLD2 (checkered) are SLH motif-bearing domains (containing approximately two and a half SLH motifs per domain). SP is a signal peptide, PET is a Thr-rich domain, and TM is a hydrophobic C-terminal domain.
FIG. 2. Alignment of the two Th-Apu SLDs with a variety of eubacterial SLDs. The alignment is based on ProDom domain 1624 (INRA, Toulouse, France), which was constructed from 15 eubacterial sequences aligned with the MultiAlin alignment tool. The proteins used to define ProDom 1624 are as follows: ANCA_CLOTM, cellulosome anchoring protein from C. thermocellum; SLP1_CLOTM and SLP2_CLOTM, SLPs from C. thermocellum; GUN_BACS6, endoglucanase from Bacillus sp. strain KSM-635; SLAP_ACEKI, S-layer glycoprotein from A. kivui; SLPH_BACBR, SLP from Bacillus brevis; SLPM_BACBR, middle cell wall protein from B. brevis; APU_THETU, type II pullulanase from T. thermosulfurigenes; XYNX_CLOTM, exoglucanase from C. thermocellum; XYNA_THESA, endoxylanase A from Thermoanaerobacterium saccharolyticum; SLAP_BACSH, SLP from Bacillus sphaericus; OMPA_THEMA, outer membrane protein A from Thermotoga maritima; SLAP_THETH, SLP from Thermoanaerobacterium ethanolicus; SLAP_BACAN, SLP from Bacillus anthracis; SLAP_BACLI, SLP from Bacillus licheniformis. The Th-Apu sequences were manually fitted to the ProDom alignment by introducing uniform gaps where necessary. White characters on a black background indicate areas where sequence identity is equal to or greater than 50%, while gray shading indicates areas where sequence similarity is equal to or greater than 50%. The groups of similar residues were considered to be I, L, V, M; D, E; N, Q; S, T; R, K; and F, Y, W. Dots indicate gaps in the sequences. SLD1_THYDRO and SLD2_THYDRO are the two Th-Apu SLDs.
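The shading rule stated in the legend above can be expressed as a short column-by-column classification. The Python sketch below is one possible reading of that rule, not the procedure actually used to prepare the figure. It assumes that the 50% threshold is measured against all sequences in a column (gaps included in the denominator), that identity refers to the single most frequent residue, and that similarity refers to the most populated residue group. The function names and the toy alignment in the usage example are hypothetical.

```python
from collections import Counter

# Groups of similar residues, as listed in the figure legend.
SIMILARITY_GROUPS = ["ILVM", "DE", "NQ", "ST", "RK", "FYW"]
GROUP_OF = {
    residue: index
    for index, group in enumerate(SIMILARITY_GROUPS)
    for residue in group
}


def classify_column(column, threshold=0.5, gap="."):
    """Classify one alignment column as 'identity', 'similarity', or 'none'.

    Assumption: fractions are computed over every sequence in the column,
    so gap characters lower the score rather than being ignored.
    """
    residues = [char.upper() for char in column]
    if not residues:
        return "none"
    non_gap = [r for r in residues if r != gap]
    if not non_gap:
        return "none"
    total = len(residues)

    # Identity: the most frequent residue reaches the threshold.
    most_common_count = Counter(non_gap).most_common(1)[0][1]
    if most_common_count / total >= threshold:
        return "identity"

    # Similarity: the most populated residue group reaches the threshold.
    group_counts = Counter(GROUP_OF[r] for r in non_gap if r in GROUP_OF)
    if group_counts and group_counts.most_common(1)[0][1] / total >= threshold:
        return "similarity"
    return "none"


def classify_alignment(rows, threshold=0.5, gap="."):
    """Classify every column of an alignment given as equal-length strings."""
    return [classify_column(col, threshold, gap) for col in zip(*rows)]


if __name__ == "__main__":
    # Toy alignment for illustration only; not taken from ProDom 1624.
    toy_rows = ["AILA", "AV.G", "ALDP", "GMEW"]
    print(classify_alignment(toy_rows))
```

Other conventions are equally plausible, for example excluding gaps from the denominator, and they would change which columns are shaded near the 50% boundary.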
The presence of two SLDs in Th-Apu is interesting since such domains have already been observed in a variety of extracellular polysaccharide-degrading enzymes, including three from T. thermosulfurigenes (a type II pullulanase [26], a polygalacturonate hydrolase, and a xylanase [27]), a xylanase from C. thermocellum (16), and an α-amylase–pullulanase from Bacillus sp. (18). Previous studies have indicated that these enzyme-associated SLDs may be responsible for the anchoring of proteins to the cell surface, possibly by interacting with peptidoglycan in certain bacteria or with the S-layer itself via SLH-SLH interactions (22, 31, 36).
A remarkable Thr-rich region (PET) follows domains R1 and R2 and precedes a smaller domain (TM) which shows a striking resemblance to an inverted lipoprotein signal peptide (34). Database searching revealed that this arrangement is very similar to that found in the C termini of S-layer proteins (SLPs) of Halobacterium halobium (17) and Haloferax volcanii (38). In these proteins, the Thr-rich regions have been shown to be targets for intensive O glycosylation (glucose-galactose disaccharides) while the shorter, hydrophobic domains have been proposed as transmembrane anchors. In Th-Apu, therefore, it is possible that the PET domain is O glycosylated and that the shorter, C-terminal TM domain serves as either a transmembrane anchor or, indeed, as a hydrophobic cell wall anchor, rather like the C-terminal domain of the SLP of Corynebacterium glutamicum (5).
Interestingly, several SLH-bearing enzymes exhibit variations of the Thr-rich region which are usually described as linker regions. Indeed, in the case of the T. thermosulfurigenes pullulanase, it has been suggested that this region, due to the probable extended, flexible nature of its structure, would allow optimal orientation of the enzyme's catalytic site toward the substrate (26). In another carbohydrate-degrading enzyme, the glucoamylase from Aspergillus awamori, it has been shown that this region is subject to O glycosylation (hypermannosylation) and that its presence in this enzyme increases the efficiency of degradation of insoluble starch granules (37). In the same way, the glycosylated linker regions of two glycanases from Cellulomonas fimi appear to increase the affinity of these enzymes for microcrystalline cellulose and may favor the disruption of cellulose fibers (32).
Expression of the catalytic domain in E. coli. Having failed in our attempts to obtain a plasmid containing an intact apuA gene, we constructed a plasmid containing only the PUL domain of Th-Apu (pAPUΔ1). Expression trials using pAPUΔ1 led to the production of a significant amount of recombinant protein in E. coli cells which was present in a soluble form in the cytoplasmic fraction after cell lysis. Crude purification of this protein could be achieved by a simple heat treatment (80°C, 30 min) which provoked the precipitation of most of the contaminating proteins. Examination of the recombinant protein by sodium dodecyl sulfate-polyacrylamide gel electrophoresis revealed the presence of two isoforms (the two species have identical N-terminal sequences) which exhibit molecular masses of approximately 90 and 85 kDa. Zymogram analysis was then employed, the results of which showed that both species possess thermostable pullulanolytic activity, indicating that the catalytic determinants are indeed present within the PUL domain.
Conclusion. On the basis of our results, we propose that Th-Apu forms part of a maltose transport operon different from the one previously described for T. litoralis (15). This enzyme is probably secreted by T. hydrothermalis cells, although rather than being released into the extracellular medium, Th-Apu may be anchored to the cell membrane or another hydrophobic component of the cell envelope via its TM domain, with its catalytic domain exposed to the surrounding medium. By analogy to other Pro-Thr-rich domains, it can be assumed that the PET domain has a rather extended, perhaps flexible, conformation which would be susceptible to proteolytic digestion (30). Thus, intense O glycosylation of this domain would reduce the vulnerability of this structure. With regard to its function, the PET domain may serve as a linker and/or perform a role similar to that of the Thr-rich region of the glucoamylase from A. awamori or may fulfill a cell wall-anchoring role rather like the Gly/Ser-rich region in SbsB (36). The role of the Th-Apu SLDs is unclear since, to our knowledge, nothing is known concerning the cell envelope of T. hydrothermalis. However, by analogy to other proteins which exhibit SLH motifs, we speculate that these domains interact with a hitherto unidentified component of the cell envelope.
Clearly, the idea that Th-Apu is localized at the cell surface appears to contradict previous experimental data from chemostat cultures which have indicated that Th-Apu is completely secreted and released into the extracellular medium. However, as others have already suggested (20, 26), it is possible that cellular attachment may be a transient state prior to complete release of the protein and/or that the prevailing conditions of a chemostat culture may provoke the degradation of the cell envelope, thus eliminating certain protein attachment points (e.g., SLH anchoring determinants).
Nucleotide sequence accession number. The GenBank accession number for the 5.2-kb DNA fragment described here is AF113969.
ACKNOWLEDGMENTS
We thank Jean-Claude Pernollet and Jean-Claude Huet for their assistance with the N-terminal protein sequencing and Béatrice Hermant for her general technical assistance.
This research forms part of a scientific program which was funded by the Europol’Agro consortium.
FOOTNOTES
- Received 28 December 1998.
- Accepted 19 March 1999.
- Copyright © 1999 American Society for Microbiology