Journal of Bacteriology, February 2003, p. 714-725, Vol. 185, No. 3
0021-9193/03/$08.00+0 DOI: 10.1128/JB.185.3.714-725.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Laboratoire de Bioénergétique et Ingénierie des Protéines, UPR 9036-CNRS, 13402 Marseille Cedex 20,1 Université de Provence, 13331 Marseille Cedex 03, France2
Received 29 July 2002/ Accepted 4 November 2002
Only a few insertion sequence (IS) elements have been described so far in clostridia; four have been reported in Clostridium perfringens (6), one has been reported in Clostridium beijerinckii NCIMB 8052 (30), and one has been reported in the cellulolytic bacterium Clostridium thermocellum (39). Clostridium cellulolyticum is a mesophilic anaerobic cellulolytic bacterium which secretes enzymatic complexes called cellulosomes (5, 17). These complexes are composed of several enzymes, most of which are cellulases (Cel proteins); these enzymes are anchored to a large scaffolding protein (160 kDa) that lacks catalytic activity, designated CipC (17, 36). Many of the cel genes form a large cluster spanning 24 kb beginning with the cipC gene (3, 38). Functional studies of the cellulosomes have been restricted so far to biochemical studies of recombinant subunits overproduced in E. coli (4, 16, 19, 38). Gene transfer techniques were recently developed for C. cellulolyticum (25, 44) and were used to modify its fermentation pathways (22). However, no mutagenic system allowing random or targeted mutagenesis has been described so far for this bacterium. Naturally occurring ISs would therefore be valuable tools for developing a transposon-based mutagenesis system.
In this paper, we describe two different IS elements which were found in the cipC gene of various isolated clones of C. cellulolyticum ATCC 35319. The features of these sequences, which are designated ISCce1 and ISCce2, are described below, and their membership in various IS families is discussed.
was used as the recipient strain for the recombinant plasmids (derivatives of pUC18, pUC19, or pGEM-T-Easy). It was grown at 37°C in Luria-Bertani medium supplemented with ampicillin (100 µg/ml) (23).
TABLE 1. Bacterial strains and plasmids
The other Clostridium strains were grown as previously described (10, 24, 26, 29, 31, 41).
DNA manipulations. Chromosomal DNA was obtained from the various Clostridium strains by using a genomic DNA purification kit (Promega). DNA from Clostridium cellulovorans was a generous gift from R. H. Doi (University of California, Davis). Large-scale and small-scale plasmid purifications from E. coli were performed by using kits from Qiagen and Promega. Restriction enzymes and DNA-modifying enzymes were purchased from Promega and Roche Applied Science and were used as recommended by the manufacturers. DNA sequencing was performed by Genome Express (Grenoble, France).
Primers and probes. Primers were purchased from MWGAG-Biotech (Courtaboeuf, France) (Table 2). Primers c1 and c2 were used to amplify sequences that disrupt the cipC gene in the cipCMut1 and the cipCMut2 strains. Primers A, B, C, D, E, F, G, and H were used in inverse PCR experiments to analyze insertion sites of the two IS elements (see below and Fig. 2). The various primers were also used for sequencing ISCce1 and ISCce2.
TABLE 2. Primer sequences
FIG. 2. Maps of the cipC gene in the wild-type strain (A) and mutant strains cipCMut2 (B) and cipCMut1 (C). orf1 (solid box) and orf2 (gray box) encode the putative transposases of ISCce1 and ISCce2, respectively. The vertical boxes represent insertion sites of ISCce1 in the cipC gene (cross-hatched box) and of ISCce2 in ISCce1 (solid box). The positions of primers A, B, C, D, E, F, G, H, c1, and c2 are indicated by arrows. Probe 2 and probe 3 are internal probes of ISCce1 and ISCce2, respectively. Restriction sites: EV, EcoRV; P, PstI; N, NdeI; HIII, HindIII; EI, EcoRI.
FIG. 1. Discovery of insertion elements in C. cellulolyticum. (A) Map of the cipC gene disrupted by an insertion element (shaded box). The encoded domains are indicated above the gene (SS, signal sequence; CBM3, carbohydrate binding module of family 3; X2, unknown function module of family 2; C1 to C8, cohesin modules). (B) Southern blot analysis of PvuII-digested genomic DNA from various strains. The blot was probed with PCR digoxigenin-labeled probe 1 (part 1) and with the cipC probe (part 2). Lane WT, wild type; lane 1, cipCMut1; lane 2, cipCMut2. Sizes (in kilobase pairs) are indicated on the left.
Inverse PCR. DNA sequences flanking the IS elements in the genome of C. cellulolyticum were amplified by inverse PCR (34). Total chromosomal DNA of the cipCMut1 strain was digested by a restriction enzyme cutting the IS element once near the unknown sequence. To determine the sequences flanking ISCce1 at its left junction, DNA was digested with PstI or NdeI (Fig. 2C). The resulting fragments were ligated and used as templates for PCR amplification with divergent primers A and B. The inverse PCR products were then purified by using a Qiaex II gel purification kit (Qiagen) and were ligated to linearized pGEM-T-Easy vector. Ligation mixtures were used to transform competent E. coli DH5α cells. Ampicillin-resistant colonies were isolated. Plasmid DNA was purified and subjected to restriction analysis. Depending on the orientation of the insert, the T7 or SP6 primer was used to sequence the junction. The same protocol was used to determine the right junctions of ISCce1, but in this case the DNA was digested with NdeI and the PCR was carried out with primers G and H (Fig. 2C). In order to find the right junctions of combined ISs, the DNA was digested with EcoRI or HindIII, and the PCR was performed with primers E and H (Fig. 2C). Fragments flanking ISCce2 at its left junctions were synthesized with primers C and D from ligated EcoRV DNA fragments. Right junctions of ISCce2 were analyzed from inverse PCR products obtained with primers E and F by using ligated EcoRI or HindIII fragments as the templates.
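The logic of the inverse PCR procedure can be illustrated in silico. The following Python sketch is only a toy model: all sequences, the function names, and the primer names `primer_left` and `primer_right` are hypothetical, and the digestion step is simplified by cutting immediately before each recognition site instead of reproducing the real cut position and sticky ends. It shows why two divergent primers located in the known IS sequence amplify the unknown flanking DNA once the restriction fragment has been circularized.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def digest(seq, site):
    """Cut a linear sequence immediately before each occurrence of `site`.

    Simplification: real enzymes cut inside the site and leave sticky ends.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[i:j] for i, j in zip(bounds, bounds[1:]) if j > i]


def inverse_pcr(fragment, outward_left, outward_right):
    """Return the product amplified from a self-ligated fragment, or None.

    outward_left anneals near the IS end and points into the flank;
    outward_right lies further inside the IS and points toward the cut site.
    """
    a = fragment.find(reverse_complement(outward_left))
    b = fragment.find(outward_right)
    if a == -1 or b == -1 or b < a + len(outward_left):
        return None
    # Reading the circle from outward_right crosses the ligation junction,
    # then the unknown flank, and stops at the end of the outward_left site.
    return fragment[b:] + fragment[:a + len(outward_left)]


# Hypothetical toy sequences (not taken from ISCce1 or cipC).
unknown_flank = "AATTATATTAAGG"
is_element = "CCGTACGATCGGATCCTAGGCATGCAAGTC" + "CTGCAG" + "TTGACCATGA"
genome = "TTTT" + "CTGCAG" + unknown_flank + is_element + "AAAA"

primer_left = reverse_complement(is_element[0:10])
primer_right = is_element[15:25]

fragments = digest(genome, "CTGCAG")  # CTGCAG is the PstI recognition site
products = [p for p in (inverse_pcr(f, primer_left, primer_right)
                        for f in fragments) if p]
print(products[0])
```

Only the fragment that carries both primer binding sites yields a product, and that product spans the junction between the flanking DNA and the end of the element, which is the region sequenced in the experiments described above.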
Computer analysis. Nucleotide sequences were analyzed with the DNASIS program, version 2.1. The BLAST program (1) was used for a homology search of the nucleotide and protein sequences in the GenBank and IS (www-is.biotoul.fr) databases. The DNA binding motifs in the proteins were predicted by using the Helix-Turn-Helix program (13). Multiple-sequence alignments, obtained with ClustalW, version 1.7 (45), were used to construct phylogenetic trees with Phylo_win (18).
Nucleotide sequence accession numbers. The nucleotide sequences of the IS elements described here, ISCce1 and ISCce2, have been deposited in the GenBank database under accession numbers AY130778 and AY130779, respectively.
and screened by colony hybridization with a 285-bp probe complementary to the 3' end of cipC (35). The 3.8-kb PvuII fragment inserted into the pH62 recombinant plasmid of one of the selected clones was found to contain an internal part of the cipC gene interrupted by a 2,659-bp sequence. This sequence contained two open reading frames (ORFs) encoding proteins which showed significant levels of identity with transposases. The cipC gene was disrupted at the beginning of the sequence encoding cohesin 7 of the scaffolding protein CipC (Fig. 1A). A liquid culture of the strain used to construct the DNA library was plated onto solid medium. DNA was extracted from 17 isolated colonies, digested with PvuII, and subjected to a Southern blot analysis by using probe 1 (Fig. 1A). Based on comparisons between the various patterns obtained, three major groups were distinguished. The patterns of the first group were comparable to the pattern obtained with the DNA purified from the reference strain (ATCC 35319). Probe 1 hybridized with many fragments (Fig. 1B, part 1, lane WT), indicating that many copies of this DNA sequence were inserted at various loci on the chromosome. In the second group, an additional fragment was detected in the DNA; this fragment was 3.9 kb long (a representative example is shown in Fig. 1B, part 1, lane 1). In the third group, the additional fragment was 2.5 kb long (Fig. 1B, part 1, lane 2). When the cipC probe was used, 1.2- and 1.8-kb fragments were detected in the DNA of the wild-type strain (Fig. 1B, part 2); the probe hybridized with two PvuII fragments of the cipC gene. The 1.8-kb fragment was also detected in lanes containing DNA from strains 1 and 2, but the 1.2-kb fragment was not detected. Instead of the latter fragment, 3.9- and 2.5-kb fragments were detected in strains 1 and 2, respectively (Fig. 1B, part 2), which indicated that the cipC gene had been disrupted in these two strains (which were designated cipCMut1 and cipCMut2).
Structural analysis of the ISs. Genomic DNAs from strains cipCMut1 and cipCMut2 were used to amplify the sequences inserted into cipC. The PCR fragments were synthesized by using primers c1 and c2 designed from the cohesin 6- and C-terminal X2 module-encoding sequences, respectively (Fig. 2). The sequences were analyzed after cloning into the pGEM-T-Easy vector by using primers c1, c2, B, C, D, and E (Fig. 2B and C). A 2,659-bp sequence was inserted into cipC in cipCMut1 DNA; this sequence was identical to the copy found in the PvuII fragment of pH62 and inserted at the same place (Fig. 1A and 2C). Another insertion element, which was 1,292 bp long, was found in the same place in cipCMut2 DNA. This element corresponds to the 2,659-bp sequence with its internal part deleted (Fig. 2B).
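The insertion lengths are consistent with the fragment sizes seen on the Southern blot, as a quick calculation shows. The sketch below assumes that the 1.2-kb PvuII fragment is about 1,200 bp long, that neither insertion contains a PvuII site, and that the short target duplication can be neglected; the function name and the rounding to 0.1 kb are illustrative choices, not part of the original analysis.

```python
WILD_TYPE_FRAGMENT_BP = 1200  # approximate size of the cipC PvuII fragment lost in the mutants
INSERTION_BP = {"cipCMut1": 2659, "cipCMut2": 1292}


def expected_fragment_kb(fragment_bp, insertion_bp):
    """Size (kb, one decimal) of a fragment after an insertion without internal cut sites."""
    return round((fragment_bp + insertion_bp) / 1000, 1)


expected = {strain: expected_fragment_kb(WILD_TYPE_FRAGMENT_BP, length)
            for strain, length in INSERTION_BP.items()}
print(expected)  # {'cipCMut1': 3.9, 'cipCMut2': 2.5}

# Length of the internal part that is absent from the shorter element.
internal_part_bp = INSERTION_BP["cipCMut1"] - INSERTION_BP["cipCMut2"]
print(internal_part_bp)  # 1367
```

The computed values match the 3.9- and 2.5-kb bands detected in cipCMut1 and cipCMut2, respectively.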
The 1,292-bp DNA sequence contained only one ORF (orf1), which was 1,047 bp long and spanned almost the entire element (Fig. 3); it was flanked by 23-bp IRs with six mismatches. The left and right IRs were located 50 and 26 bp from the extremities, respectively. Many characteristics typical of an IS were observed: (i) insertion of the 1,292-bp sequence yielded an 8-bp direct repeat (DR) footprint in the target sequence and (ii) the large ORF encoded a 348-amino-acid protein (40.2 kDa), designated TnpA1, which exhibited significant levels of identity with a hypothetical protein (designated ORF1Ap [see below]) from Actinobacillus pleuropneumoniae (57%) (2), with a putative transposase (TnpWe) from a Wolbachia endosymbiont of Drosophila simulans (57%) (accession number AAK69114), and with the putative proteins ID317 (55%) (21) and ChnZ (62%) (9) from Bradyrhizobium japonicum and Acinetobacter sp. strain SE19, respectively. It also exhibited some identity with the transposases encoded by many ISs belonging to the IS481 family (20% to 38%) and with one IS (ISPg5 [7]) belonging to the IS3 family (28%) (8). A multiple alignment of some of these proteins (Fig. 4) highlighted many stretches of conserved amino acids, including three aspartic acid residues and two glutamic acid residues. Three of these amino acids might constitute the DDE catalytic triad (Fig. 4). Protein structure predictions suggested that an α-helix-turn-helix (HTH) DNA binding motif was present at the N terminus of TnpA1 (Fig. 3). Based on all these criteria, the element was designated ISCce1, although it does not have any canonical IRs at its extremities.
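Imperfect inverted repeats such as the 23-bp IRs with six mismatches reported above are scored by comparing one repeat with the reverse complement of the other. The following minimal sketch shows that comparison; the 10-bp sequences are invented for illustration and are not the ISCce1 repeats, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def ir_mismatches(left_ir, right_ir):
    """Count mismatches between the left IR and the reverse complement of the right IR."""
    if len(left_ir) != len(right_ir):
        raise ValueError("inverted repeats must have the same length")
    partner = reverse_complement(right_ir)
    return sum(a != b for a, b in zip(left_ir, partner))


# Invented 10-bp repeats: a perfect pair and a pair with two substitutions.
print(ir_mismatches("GGTTCAAGTC", "GACTTGAACC"))  # 0
print(ir_mismatches("GGTTCAAGTC", "AACTTCAACC"))  # 2
```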
FIG. 3. Nucleotide sequence of ISCce1 and predicted amino acid sequence of transposase TnpA1. The putative ribosome binding site sequence is enclosed in a box. The ORF encoding transposase TnpA1 starts with an ATG codon at position 93 (boldface type) and ends with a TAA stop codon at position 1137 (asterisks). The deduced amino acid sequence is indicated under the corresponding nucleotide sequence. A palindromic sequence is overlined with arrows. The 8-bp duplicated sequence at the insertion site of ISCce2 into ISCce1 is underlined. Imperfect terminal IRs are indicated by incomplete arrows, with mismatches indicated by interruptions. The potential HTH DNA binding motif in the TnpA1 amino acid sequence is indicated by boldface type.
FIG. 4. Alignment of TnpA1 (from ISCce1) with ORF1Ap (A. pleuropneumoniae), ID317 (B. japonicum), TnpWe (Wolbachia endosymbiont of D. simulans), ChnZ (Acinetobacter sp. strain SE19), and proteins encoded by IS1121 (Clavibacter michiganensis) and ISPg5 (Porphyromonas gingivalis). A black background indicates identical amino acids, a dark gray background indicates very similar amino acids, and a light gray background indicates weakly similar amino acids. Conserved aspartic acid (D) and glutamic acid (E) residues are indicated below the alignment.
FIG. 5. Nucleotide sequence of ISCce2 and predicted amino acid sequence of the transposase TnpA2. The ORF encoding transposase TnpA2 starts with an ATG codon at position 109 (boldface type) and ends with a TAA stop codon at position 1303 (asterisks). The putative ribosome binding site sequence is enclosed in a box. The potential HTH DNA binding motif and the potential DDE catalytic triad motif in the TnpA2 amino acid sequence are indicated by boldface type and by circled residues, respectively. Terminal IRs are indicated by incomplete arrows, with mismatches indicated by interruptions.
Insertion sites of ISCce1 and ISCce2 in the genome of C. cellulolyticum. To determine the sequences flanking ISCce1 in the genome of C. cellulolyticum, chromosomal DNA of the cipCMut1 strain was digested with NdeI, EcoRI, HindIII, or PstI. Ligated fragments were used as templates for inverse PCR performed with primers A and B for the left junctions and primers G and H or primers E and H for the right junctions (Fig. 2C). PCR products were cloned into the pGEM-T-Easy vector and then analyzed by sequencing. Fourteen plasmids harboring fragments different from the cipC gene and disrupted by ISCce1 were obtained. Three of the ISCce1 copies were combined with an ISCce2 copy. The ISCce1 target sequences found were AT rich, but no consensus sequence could be identified (Table 3).
TABLE 3. Frequency of base occurrence at each position of the ISCce1 insertion sites
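A table of this kind is obtained by counting each base at each position of the aligned insertion sites. The sketch below shows one way to build such counts and to estimate the AT content; the four 8-bp sites are hypothetical placeholders and are not the insertion sites determined in this study, and the function names are illustrative.

```python
from collections import Counter


def position_frequencies(sites):
    """Return, for each position, a dict with the count of A, C, G, and T."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all target sites must have the same length")
    table = []
    for column in zip(*sites):
        counts = Counter(column)
        table.append({base: counts.get(base, 0) for base in "ACGT"})
    return table


def at_fraction(sites):
    """Fraction of A and T bases over all target sites."""
    joined = "".join(sites)
    return (joined.count("A") + joined.count("T")) / len(joined)


# Hypothetical 8-bp target sites, for illustration only.
toy_sites = ["ATTTAAAT", "TAAATTCA", "AATGTTTA", "TTATAAGT"]

for position, counts in enumerate(position_frequencies(toy_sites), start=1):
    print(position, counts)
print(at_fraction(toy_sites))
```

A consensus would appear as a position at which one base dominates; AT richness without a consensus, as reported for ISCce1, corresponds to high A and T counts spread over all positions.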
In addition, partial sequences determined for various copies of each IS were found to be identical to the sequences of the copies initially found in the cipC gene (data not shown). Furthermore, the noncanonical location of IRs, which were found near the end of ISCce1, was confirmed by looking at other copies cloned from genomic DNA.
Close physical link between ISCce1 and ISCce2. ISCce2 was initially found within ISCce1 in the cipCMut1 strain. In addition, the sequence analysis of the IS junctions showed that at least four of the seven ISCce2 copies found in this strain were inserted into ISCce1. In order to examine this unusual association in another way, the Southern blot which was previously probed with a combined IS element was reprobed with probe 2 to detect only ISCce1 (Fig. 2B) and then with probe 3 to detect only ISCce2 (Fig. 2C). The four bands detected in the wild-type lane with probe 3 could be superimposed on the bands revealed with probe 2 (Fig. 6B). Furthermore, the pattern obtained with the cipCMut1 DNA (when probe 3 was used) contained three bands which were absent in the wild-type DNA lane (Fig. 6B); two of these bands could be superimposed with those detected in the same lane with probe 2 (Fig. 6A). These findings suggest that many PvuII fragments contain both IS elements, although the possibility that some of the common bands may have resulted from comigration of two different fragments, each containing one of the two ISs, cannot be ruled out. Nevertheless, of the seven ISCce2 copies studied, at least four were found to be inserted into ISCce1 (see above). Taken together, these results strongly suggest that ISCce1 is a hot spot for the transposition of ISCce2.
FIG. 6. Distribution and association of ISCce1 and ISCce2 in C. cellulolyticum: Southern blot analysis of PvuII-digested genomic DNAs of the wild type (lane WT), cipCMut1 (lane 1), and cipCMut2 (lane 2) and of EcoRI-digested cipCMut1 DNA (lane 1E). Blots were hybridized with the ISCce1 probe (A) and with the ISCce2 probe (B). Superimposable bands detected in both hybridization experiments for all strains are indicated in panel B by arrowheads. The bands indicated by circles are superimposable bands obtained only with the mutant strains. Sizes (in kilobase pairs) are indicated on the left and on the right.
To determine whether ISCce1 and ISCce2 were present in other clostridia, DNAs extracted from some selected strains were digested with PvuII, electrophoresed, and hybridized with probe 2 or probe 3. No fragments homologous to ISCce1 were detected in any of the strains tested with probe 2 in hybridization experiments carried out at 68 or 55°C (data not shown). However, probe 3 hybridized with one or two DNA fragments from Clostridium cellobioparum, Clostridium papyrosolvens, Clostridium termitidis, and C. cellulovorans in the experiments carried out at 55°C (Fig. 7). These fragments may therefore have low levels of sequence similarity with ISCce2.
FIG. 7. Distribution of ISCce2 in Clostridium strains: Southern blotting of PvuII-digested DNAs of clostridia hybridized with probe 3 at 55°C. Lane 1, C. cellobioparum; lane 2, C. papyrosolvens; lane 3, C. termitidis; lane 4, C. cellulovorans; lane 5, C. saccharobutylicum; lane 6, C. thermocellum; lane 7, C. acetobutylicum.
The nucleotide sequence of ISCce1 has only one long ORF, which codes for a putative transposase designated TnpA1. The deduced amino acid sequence showed significant levels of identity with proteins ORF1Ap (2), ID317 (21), TnpWe (accession number AAK69114), and ChnZ (9) and with some transposases encoded by ISs belonging to the IS481 family (33) and by ISPg5 belonging to the IS3 family (8) (Fig. 4).
To investigate the evolutionary relationships between TnpA1 and these proteins, phylogenetic trees were drawn. The Phylo_Win program (18) was applied to the multiple-sequence alignment of TnpA1, ORF1Ap, ID317, TnpWe, and 11 transposases encoded by elements belonging to the IS481 family (8), which was obtained with the program CLUSTAL W (45). The resulting tree shows that TnpA1 (ISCce1), ORF1Ap, ID317, and TnpWe may constitute a group (Fig. 8A). This group is separated from the 11 members of the IS481 family, and the existence of a relationship between the two groups was not confirmed by a high bootstrap confidence level. A similar analysis was carried out with TnpA1 (ISCce1), ORF1Ap, ID317, TnpWe, ChnZ, and six transposases encoded by ISs belonging to the IS3 family (all of which belong to the IS3 group) (8). As ChnZ is only 219 amino acids long, the tree was generated from part of the multiple-sequence alignment. As described above, TnpA1 (ISCce1), ORF1Ap, ID317, TnpWe, and ChnZ constitute a group separated from the group formed by the IS3 family members (Fig. 8B). Again, the relationship between the ISCce1 group and the IS3 group was not confirmed by a high bootstrap confidence level. Like TnpA1, the other proteins in the ISCce1 group also show some homology with the transposases of the IS481 family and with some members of the IS3 family (data not shown).
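The bootstrap values mentioned above come from resampling the columns of the multiple-sequence alignment with replacement and asking how often a grouping is recovered. The sketch below illustrates only that resampling idea on three aligned sequences, using a simple proportion of differing positions as the distance; it is not the neighbor-joining procedure implemented in Phylo_Win, and the function names, the three-sequence criterion, and the toy alignment are assumptions made for illustration.

```python
import random


def p_distance(a, b, columns):
    """Proportion of differing residues over the given columns, skipping gaps."""
    compared = differing = 0
    for i in columns:
        if a[i] == "-" or b[i] == "-":
            continue
        compared += 1
        differing += a[i] != b[i]
    return differing / compared if compared else 1.0


def bootstrap_support(alignment, x, y, z, replicates=100, seed=0):
    """Percentage of replicates in which x and y are closer to each other than to z."""
    rng = random.Random(seed)
    n_columns = len(alignment[x])
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        columns = [rng.randrange(n_columns) for _ in range(n_columns)]
        d_xy = p_distance(alignment[x], alignment[y], columns)
        d_xz = p_distance(alignment[x], alignment[z], columns)
        d_yz = p_distance(alignment[y], alignment[z], columns)
        if d_xy < min(d_xz, d_yz):
            hits += 1
    return 100 * hits / replicates


# Toy aligned sequences (hypothetical, not the transposases analyzed here).
toy_alignment = {
    "protA": "MKLVDEHTRW",
    "protB": "MKLVDEHSRW",
    "protC": "MRIV-EQTKF",
}
print(bootstrap_support(toy_alignment, "protA", "protB", "protC"))
```

A grouping that is recovered in only a small fraction of replicates, like the link between the ISCce1 group and the IS481 or IS3 family members, is one that depends on a few alignment columns and is therefore not considered well supported.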
FIG. 8. Phylogenetic trees showing relationships between ISCce1 and members of the IS481 and IS3 families. The trees were constructed from a multiple-sequence alignment (ClustalW) (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_clustalw.html) of transposases of ISs and proteins showing identity with TnpA1 (ISCce1) by using the neighbor-joining method (Phylo_Win) (http://biom1.univ-lyon1.fr/software/phylowin.html). The circled numbers are the percentages of support (bootstrap values) for individual nodes in the tree obtained by performing 100 replicate searches. Only values higher than 65% are indicated. A percentage of accepted mutation distance is indicated above each clade. (A) Tree constructed from the entire multiple-sequence alignment of TnpA1 (ISCce1), 11 members of the IS481 family, and other homologous proteins. The accession numbers for the members of the IS481 family used are as follows: ISA0963_6, AE000986; ISSco2, AL10949; ISMav2, AF286339; ISVch1, AF034434; ISAni1, X97015; IS1121, AF079817; IS1652, AL109949; IS1002, Z54268; ISBm3, AF047478; IS481, M22031; and ISCgl1, U85507. The accession numbers for the other homologous proteins used are as follows: TnpWe, AAK69114; ID317, AAG60838; and ORF1Ap, S27482. (B) Tree constructed from part of the multiple-sequence alignment of TnpA1, five members of the IS3 family, and other homologous proteins. The accession numbers for the members of the IS3 family used are as follows: ISPg5, AF224744; IS1520, AJ250598; IS981, M33933; IS600, X05992; and IS_LL6, U23813. The accession numbers for the other homologous proteins used (TnpWe, ID317 and ORF1Ap) are as described above; in addition, ChnZ (accession number AAG10024) was used.
These results and those of the phylogenetic analysis suggest that the unknown protein from A. pleuropneumoniae (2) (ORF1Ap), protein ID317 from B. japonicum (21), ChnZ from Acinetobacter sp. strain SE19 (9), and the putative transposase from a Wolbachia endosymbiont of D. simulans (accession number AAK69114) might be encoded by complete or truncated ISs. These sequences, along with ISCce1, might form a new group of ISs, which is probably related to the IS481 and the IS3 families. In this group, ISs would (i) exhibit IRs, but not at the extremities of the element; (ii) generate DRs in the target upon transposition, although this feature could be identified only for ISCce1; and (iii) contain only one ORF encoding a putative transposase. The strict conservation of several D and E residues strongly suggests that the catalytic mechanism of these transposases involves a DDE triad.
The nucleotide sequence of ISCce2 has one large ORF (tnpA2) that putatively encodes a transposase (TnpA2). This protein has significant levels of identity with many transposases belonging to the IS256 family. A phylogenetic tree was generated for 18 IS elements belonging to the IS256 family or showing some identity with members of this family (Fig. 9). This tree shows that ISCce2 might be a member of the IS256 family, but as it is located on a separate clade of the tree, it does not have any close relatives belonging to this family. ISCce2 has all the features of the IS256 family (8): (i) it has IRs at its extremities (35 bp in ISCce2); (ii) it duplicates an 8-bp target site sequence upon transposition; (iii) TnpA2 has a DDE motif that has extended regions similar to that of the transposases of the IS256 family; and (iv) TnpA2 exhibits similarities with the putative MurA gene product of the autonomous mutator element of Zea mays, MuDR (14).
FIG. 9. Phylogenetic tree of some members of the IS256 family. This tree was constructed from a multiple-sequence alignment (ClustalW) (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_clustalw.html) of transposases of ISs and proteins showing identity with TnpA2 (ISCce2) by using the neighbor-joining method (Phylo_Win) (http://biom1.univ-lyon1.fr/software/phylowin.html). The circled numbers are the percentages of support (bootstrap values) for individual nodes on the tree obtained by performing 100 replicate searches. Only values higher than 65% are indicated. A percentage of accepted mutation distance is indicated above each clade. The accession numbers for the proteins and ISs are as follows: TnpSm, BAB07803; ISRo1, U70364; IS1245, L33879; IS1553I, NP_338287; Tnp1250b, AF024666; IS1601-A, AAD44203; IS1081, X61270; IS1408, U62766; IS1407, X97307; IS1164, D67027; IS1512, U95314; IS16, U35366; IS256, M18086; IS406, M83145; ISRm5, U08627; ISRm3, M60971; and IS905A, L20851.
Since ISCce1 and ISCce2 were isolated after they had inserted into the cipC gene, they are transpositionally active. These ISs are therefore promising candidates for the construction of mutagenic tools, which should allow identification of new genes involved in cellulolysis.
We acknowledge the financial support received from the Centre National de la Recherche Scientifique and Université de Provence, from Conseil Général des Bouches du Rhône, and from Région Provence-Alpes-Côte d'Azur. H. Maamar received a fellowship from the Tunisian Government.