Journal of Bacteriology, August 2003, p. 4891-4900, Vol. 185, No. 16
0021-9193/03/$08.00+0 DOI: 10.1128/JB.185.16.4891-4900.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Institute of Molecular and Cellular Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0032, Japan
Received 2 December 2002/ Accepted 27 May 2003
λ-integrase family and in the recombinases of the Hin/Res family, some of which are involved in inversion of a DNA segment. It has been reported that PIV may have part of the catalytic motif found in reverse transcriptases (19, 24, 26). PIV has, however, recently been shown to have a D-E-D triad motif corresponding to the catalytic D-D-E motif that is conserved in integrases encoded by retroviruses related to avian sarcoma virus (46).

The full genome sequences of Escherichia coli K-12 MG1655 and enterohemorrhagic E. coli (EHEC) O157:H7 have been determined (8, 16, 17, 36). We have been searching for mutations arising from rearrangements in various E. coli strains, including E. coli C and six ECOR strains from an E. coli collection, focusing on the DNA segment (about 465 kb in length) corresponding to the 0- to 10-min region of the E. coli K-12 map. The search used PCR with primers that hybridize to the MG1655 sequence at positions spaced at 5-kb intervals. DNA sequencing of the polymorphic fragments generated from the E. coli strains revealed that the polymorphism is due to mutations such as insertions, deletions, substitutions, and duplications of a DNA segment. Most of the insertions were identified by a computer-aided homology search as having homology with known IS elements.
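The PCR survey described above tiles a region of about 465 kb with primers placed at 5-kb intervals. The following is a minimal sketch of how such tile coordinates could be enumerated. The function name `tile_windows`, the zero-based start coordinate, and the clipping of the last window are illustrative assumptions and are not taken from the study.

```python
def tile_windows(region_start, region_end, step):
    """Return (start, end) pairs covering [region_start, region_end) in steps of `step` bp."""
    if step <= 0:
        raise ValueError("step must be positive")
    windows = []
    pos = region_start
    while pos < region_end:
        # Clip the final window so it never runs past the end of the region.
        windows.append((pos, min(pos + step, region_end)))
        pos += step
    return windows


# Hypothetical coordinates: a 465-kb region tiled in 5-kb steps.
windows = tile_windows(0, 465_000, 5_000)
print(len(windows))   # number of 5-kb windows
print(windows[0], windows[-1])
```

Under these assumptions the region is covered by 93 windows, each of which would be bracketed by one primer pair.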
In this study, we report that an E. coli strain, ECOR28, has a repeated sequence with homology to piv genes at three loci in the 0- to 10-min region of the E. coli K-12 map. We show that this sequence is a novel insertion element, named IS621, which does not have terminal inverted repeats (IRs) and which encodes a transposase with partial homology to those encoded by the IS110/IS492 family elements. The N-terminal regions of PIV proteins and transposases encoded by the IS110/IS492 family elements, including IS621, appear to have four acidic amino acid residues constituting a tetrad motif, D-E (or D)-D-D, rather than the triad motif, as the catalytic center. IS621, which shows the highest homology to piv, has terminal sequences that have homology to the 26-bp IRs of pilin gene inversion sites, suggesting that IS621 initiates transposition through recognition of its terminal regions and cleavage at its ends by the same mechanism that PIV uses to promote inversion at the pilin gene inversion sites.

Interestingly, IS621 was found to be present in repetitive extragenic palindromic (REP) sequences located at three loci in the ECOR28 genome. REP sequences are short bacterial repeats, 35 to 40 bp in length, with imperfect palindromic sequences (for a review, see reference 5). In most cases, REP sequences at each locus occur in clusters called bacterial interspersed mosaic elements, which contain 2 to 12 REP sequences, with other short conserved sequences in positions spaced at intervals (5, 13). The number of copies and the arrangement of REP sequences vary among strains (45, 49). We show that IS621 is inserted into the same site in one of two copies of the REP sequences located at each of the three loci identified in the 0- to 10-min region as well as at seven loci identified in other regions of the ECOR28 genome. Several other elements belonging to the IS110/IS492 family also transpose to specific sites in repeated sequences, as does IS621. We discuss the possibility that IS621 and other IS110/IS492 family elements recognize a sequence of about 15 bp that contains the insertion site and is fully or partially conserved among the repeated sequences.
MATERIALS AND METHODS
Media. The culture media used were L broth and L-rich broth (51), SOC medium (40), and -medium (51). The L agar plates used contained 1.5% (wt/vol) agar (Wako) in L broth.
DNA preparation. Genomic DNA was extracted from a 5-ml bacterial culture by the cetyltrimethylammonium bromide-NaCl method described previously (4). Plasmid DNA was extracted from cells cultured at 37°C for 16 h in 3 ml of L broth containing 100 µg of ampicillin/ml by using a Quantum prep kit (Bio-Rad).
PCR. The chemically synthesized oligonucleotide primers used are listed in Table 1. PCR was performed according to the standard protocol in a 25-µl solution containing a 0.4 mM concentration of each deoxyribonucleoside triphosphate, a 0.24 µM concentration of each pair of primers, 1.5 U of LA-Taq DNA polymerase (Takara), and 0.2 µg of genomic DNA as the template. The PCR conditions were as follows: denaturation at 98°C for 40 s, annealing at 55°C for 30 s, and extension at 72°C for 2 min for a total of 30 cycles. PCR was done with a DNA thermal cycler model PJ2000 (Perkin-Elmer). PCR products were electrophoresed in a 1.0% agarose gel (Wako) in TAE buffer (40 mM Tris-acetate, 1.0 mM EDTA [pH 8.0]) at 100 V for 1 h.
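As a quick check on the reaction setup described above, the concentrations can be converted into absolute amounts per 25-µl reaction, and the programmed hold time can be summed over the 30 cycles. This is only a worked-arithmetic sketch. It assumes that the stated 0.24 µM applies to each individual primer and it ignores ramp times between temperature steps; neither point is specified in the text.

```python
def amount_nmol(conc_mM, volume_ul):
    """Amount in nmol for a concentration in mM and a volume in µl (1 mM = 1 nmol/µl)."""
    return conc_mM * volume_ul


REACTION_VOLUME_UL = 25.0

# 0.4 mM of each deoxyribonucleoside triphosphate in 25 µl.
dntp_each_nmol = amount_nmol(0.4, REACTION_VOLUME_UL)

# 0.24 µM is 0.00024 mM; multiply by 1000 to express nmol as pmol.
# Assumption: the stated concentration applies to each primer individually.
primer_pmol = amount_nmol(0.24 / 1000, REACTION_VOLUME_UL) * 1000

# 30 cycles of 40 s + 30 s + 2 min, ignoring ramp times (an assumption).
cycle_seconds = 40 + 30 + 2 * 60
total_seconds = 30 * cycle_seconds

print(f"each dNTP: {dntp_each_nmol:.1f} nmol")
print(f"each primer: {primer_pmol:.1f} pmol")
print(f"programmed hold time: {total_seconds} s ({total_seconds / 60:.0f} min)")
```

Under these assumptions each reaction contains 10 nmol of each dNTP and 6 pmol of each primer, and the programmed steps add up to 5,700 s (95 min).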
TABLE 1. Oligonucleotide primers used
Purification and cloning of DNA fragments. The PCR-amplified fragments were cut out of an agarose gel, recovered by using a centrifuge tube with a filter (Suprec-01; Takara), and ethanol precipitated. The DNA fragments were cloned by dATP tailing, followed by ligation to a TA cloning vector as follows: dATP tailing was performed in 10 µl of solution containing 1x buffer, 2.5 mM MgCl2, 375 µM deoxyribonucleoside triphosphates, 5 U of LA-Taq DNA polymerase, and 5 µl of the purified DNA fragment at 72°C for 15 to 30 min; ligation was performed in 12 µl of solution containing 2 µl of dATP-tailed DNA solution, 50 ng of pGEM-T easy vector (Promega), and 400 U of T4 DNA ligase at 4°C for 16 to 20 h. The sample DNA was transformed into E. coli strain JM109 by using the method described previously (40). The white colonies were selected on L agar plates containing 100 µg of ampicillin/ml, 0.5 mM IPTG (isopropyl-β-D-thiogalactopyranoside), and 100 µg of X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) per ml.
DNA sequencing. DNA sequencing was performed by the dideoxynucleotide chain termination method with oligonucleotide primers and an ABI BigDye Terminator DNA sequencing kit (Applied Biosystems). The PCR products were purified with a Centri-Sep spin column (Princeton) and analyzed with an ABI 377 DNA sequencer (Applied Biosystems).
Computer analysis. Nucleotide sequences were analyzed using Genetyx-Mac version 10.1 and HarrPlot version 2.0 software. A homology search was performed by using the search engines FASTA (34), BLAST (1), and SSEARCH (33, 43) with the DDBJ homepage (http://www.ddbj.nig.ac.jp). Amino acid sequences of PIV proteins and transposases were aligned with YooEdit version 1.71, Clustal W version 1.7, and SeAl version 1.d1 software. A phylogenetic tree was constructed with Phylip version 3.572, njplot, and TreeviewPPC software. The secondary structures of proteins were analyzed with the software program PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred) (22, 29).
RESULTS
FIG. 1. Physical maps of the DNA segments around the insertion sites of IS621 in the genomes of nine E. coli strains. The solid lines show the sequences of E. coli K-12 MG1655 and EHEC O157:H7. For the other E. coli strains, sequences identified by gel electrophoresis of PCR-amplified fragments and by sequencing are shown. Positions of IS element insertions (IS621, IS4, and IS200) and REP sequences (small rectangles) are shown. IS4 and IS200 have been identified in the regions shown. Numbers with or without K are coordinates given to the E. coli K-12 MG1655 sequence in kilobases or base pairs, respectively. The positions of the primers used for PCR are shown by small arrowheads; the solid arrowheads represent primers that were used to analyze polymorphism of the fragments with or without IS621 or REP. Note that IS621 is present in REP sequences in ECOR28 and that ECOR36 and ECOR46 have no REP sequences.
Comparison of the sequences flanking IS621 at each of three loci in ECOR28 with the MG1655 (or O157:H7) sequence having no IS621 showed that a 2-bp sequence CT appeared at the junction regions of IS621 with the target sequence (Fig. 2A). Note that IS621 has no IRs at its termini (Fig. 2A). IS621 has one large open reading frame, 981 bp in length, possibly encoding transposase (Fig. 3A).
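To illustrate how a single large open reading frame can be located and how its length maps to protein length, a minimal sketch follows. It scans only the given strand for ATG-to-stop frames. The toy sequence is hypothetical and is not part of IS621, and the function names are illustrative. Whether the 981-bp figure includes the stop codon is not stated in the text, so the residue count below is computed under the assumption that it does.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the given strand.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    Returns None if no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None                   # look for the next ORF in this frame
    return best


def residues_if_stop_included(orf_length_bp):
    """Number of amino acids, assuming the ORF length counts the stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1


toy = "CCATGAAACCCGGGTTTTAGCC"   # hypothetical sequence, not from IS621
print(longest_orf(toy))
print(residues_if_stop_included(981))
```

With the stop codon counted, a 981-bp frame corresponds to 327 codons and therefore 326 amino acid residues.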
FIG. 2. Nucleotide sequences with IS621 at 10 loci in the ECOR28 chromosome. (A) Nucleotide sequences with IS621 at three loci in the region corresponding to the 0- to 10-min region of the E. coli K-12 map. The nucleotide sequence without IS621 at each position in the E. coli K-12 (MG1655) chromosome is shown for comparison. Two types of REP sequences that have the highest homology to the sequences at the 10 loci are shown in bold. Palindromic sequences in REP are indicated by short horizontal arrows. Note that each element is flanked by 2-bp sequences (in boxes). (B) Nucleotide sequences with IS621 at seven loci located outside of the 0- to 10-min region. IS621 is inserted into the 2-bp sequences shown in boxes. The nucleotide sequence without IS621 at each position in the E. coli K-12 (MG1655) chromosome is shown for comparison. Note that REP sequences at five positions (kb 1814.3, 1952.5, 2840.5, 4372.1, and 4468.0) are in a reverse orientation.
FIG. 3. (A) The nucleotide sequence of IS621. The amino acid sequence of transposase encoded by one large open reading frame (orf) in IS621 is shown below the nucleotide sequence. A possible Shine-Dalgarno sequence preceding the initiation codon ATG is underlined. Palindromic sequences found in the region downstream of the coding region are shown by a pair of arrows. The CT sequences that appear at the junction regions of IS621 with the target sequence (Fig. 2) are underlined. Note that IS621 has no terminal inverted repeats. Four acidic residues constituting a catalytic motif are shown in circles. (B) Comparison of the pilin gene inversion site sequences and terminal sequences of IS621 and three other IS110/IS492 family elements that are closely related to IS621. Pilin gene inversion site sequences (horizontal arrows) and flanking sequences are indicated by uppercase and lowercase letters, respectively. Terminal sequences of IS621 at kb 5.6 (Fig. 1), IS492 (7), IS110 (9, 26), and IS1000 (3) and their flanking regions are shown by uppercase and lowercase letters, respectively. The CT sequences duplicated at the junction regions with IS621 are underlined. Homologous sequences between pilin gene inversion site sequences and terminal sequences of IS621 are shown in boxes.
The results described above suggest that IS621 recognizes REP sequences and is inserted into specific sites within them. To confirm this, we carried out ADL PCR (see Materials and Methods) to identify and characterize additional IS621 members presumed to be present in the ECOR28 chromosome. We found nine IS621 members at different loci, two of which were the same as those identified at kb 138.7 and 216.0 in the E. coli K-12 map. The seven new IS621 members had sequences identical to those of the three members initially identified or differed from them by substitutions of 3 bp at most. All the new members were found to be present at specific sites in the Z1-type REP sequences (Fig. 2B), confirming the above suggestion.
IS621, an IS110/IS492 family element most closely related to piv. A computer-aided homology search based on the nucleotide sequence of IS621 as the query revealed that IS621 is homologous to the piv genes encoding PIV from various bacteria (Table 2). A homology search based on the amino acid sequence of the putative protein encoded by IS621 as the query, however, revealed that IS621 has partial homology not only to PIV proteins but also to transposases encoded by the IS110/IS492 family elements in various bacteria, including even archaebacteria (Table 2 and Fig. 4). This finding is consistent with the fact that transposases encoded by the IS110/IS492 family elements have partial homology to PIV (25) and indicates that IS621 is a new member of the IS110/IS492 family. A phylogenetic tree based on the amino acid sequences of PIV proteins and transposases revealed that piv genes form a group, whereas IS elements form several groups distinct from the piv gene group (Fig. 5). Note that IS621 belongs to the piv gene group but not to the IS groups (Fig. 5).
TABLE 2. piv genes and IS110/IS492 family elements
FIG. 4. An alignment of PIV proteins and transposases. Only three regions that are well conserved are shown. A RuvC Holliday junction resolvase (accession no. P24239) is aligned to show that four acidic amino acid residues [D, E (or D), D, and D, shown in boxes] in the PIV and transposase proteins are present in positions corresponding to those constituting the catalytic center in the RuvC protein. Amino acid residues present in all the proteins are indicated by asterisks. Other conserved amino acid residues are indicated by dots.
FIG. 5. A phylogenetic tree of piv genes and IS110/IS492 family elements. The tree was constructed by the neighbor-joining method based on amino acid sequences of PIV proteins and transposases (Fig. 4). The scale bar equals a distance of 0.1.
The presence of four acidic amino acids conserved in PIV proteins and transposases encoded by IS110/IS492 family elements. Retroviral integrases and transposases encoded by many IS elements with terminal IRs have three amino acid residues constituting the catalytic D-D-E motif, which is responsible for the strand transfer reaction (see references 11 and 15). Recently, PIV has been reported to have a triad motif, D-E-D, which corresponds to the D-D-E motif conserved in integrases encoded by retroviruses related to avian sarcoma virus (46). The N-terminal regions of transposases encoded by the IS110/IS492 family elements, including IS621, appeared to have the D-E-D (or D-D-D) motif at corresponding positions in transposases, as occurs in PIV proteins (see the first three acidic amino acid residues shown in the boxes in Fig. 4). Interestingly, PIV and transposase proteins had another D residue conserved at the position, three amino acids downstream of the third D residue in the D-E (or D)-D motif (Fig. 4). This leads us to assume that these proteins may have a tetrad motif, D-E (or D)-D-D, like the D-E-D-D motif identified as the catalytic center in the RuvC Holliday junction resolvases (2, 21, 39). In fact, four acidic amino acid residues conserved in the PIV and transposase proteins were present in positions corresponding to those constituting the catalytic center in a RuvC protein (Fig. 4; see Fig. 3A for the positions of four acidic amino acids constituting the tetrad motif in the IS621 transposase).
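The tetrad criterion described above can be expressed as a simple check on four residue positions taken from an alignment. The sketch below is illustrative only. The function name, the toy sequence, and the positions are hypothetical, and the default spacing of three residues between the third and fourth acidic residues is taken from the description in the text rather than from any published alignment coordinates.

```python
def has_tetrad_motif(protein, positions, last_spacing=3):
    """Check four 1-based positions for the D-E (or D)-D-D pattern.

    The fourth residue must lie `last_spacing` residues downstream of the third.
    """
    if len(positions) != 4:
        raise ValueError("exactly four positions are required")
    p1, p2, p3, p4 = positions
    # Positions must be increasing and fall inside the sequence.
    if not (1 <= p1 < p2 < p3 < p4 <= len(protein)):
        return False
    residues = [protein[p - 1].upper() for p in positions]
    allowed = [{"D"}, {"D", "E"}, {"D"}, {"D"}]
    pattern_ok = all(r in a for r, a in zip(residues, allowed))
    return pattern_ok and (p4 - p3 == last_spacing)


toy_protein = "MDAAEKKDLLDG"   # hypothetical sequence for illustration
print(has_tetrad_motif(toy_protein, (2, 5, 8, 11)))   # D, E, D, D with spacing 3
print(has_tetrad_motif(toy_protein, (2, 5, 8, 12)))   # fourth residue is G
```

In practice the four positions would come from the columns of a multiple alignment such as the one in Fig. 4, not from a fixed offset in each sequence.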
The tertiary structure of a RuvC Holliday junction resolvase from E. coli has been determined by X-ray crystallography (2). The secondary structure of the RuvC protein, based on the tertiary structure, is shown schematically in Fig. 6. Note that the secondary structure of the RuvC protein was generally similar to that deduced by using the software program PSIPRED (Fig. 6). Therefore, the secondary structures of the IS621 transposase and PIV proteins were analyzed with PSIPRED and compared with those of RuvC. The secondary structures deduced for IS621 transposase and PIV proteins were found to be similar to each other and to RuvC in the regions with four acidic amino acid residues constituting the D-E-D-D motif (Fig. 6). This supports the above assumption that IS621 transposase and PIV proteins are closely related to each other and to RuvC with the D-E-D-D motif.
FIG. 6. Comparison of secondary structures of RuvC, IS621 transposase, and PIV proteins. The secondary structure of RuvC, based on the tertiary structure (PDB code 1HJR), is shown above the polypeptide sequence. α helices are indicated by ribbons, and β sheets are indicated by arrows. Other features are indicated by straight solid lines. Positions of acidic amino acids constituting the D-E-D-D motif are indicated by thin vertical lines. The secondary structures of RuvC, IS621 transposase, and PIV proteins were predicted by PSIPRED and are shown under the polypeptide sequence of each protein.
FIG. 7. Nucleotide sequences with or without another IS110/IS492 family element, ISSt1232 or IS1594. (A) Nucleotide sequences of ISSt1281 with or without ISSt1232 at two loci. ISSt1281 sequences are shown in bold. (B) Nucleotide sequences with or without IS1594 at three loci. REP-like sequences are shown in bold for comparison. The two types of REP-like sequences that have the highest homology to the sequences at the three loci are shown. Palindromic sequences in the REP-like sequences are indicated by horizontal arrows. Note that REP-like sequences at kb 266.6 and 4388.6 are identical but are in a reverse orientation. Note also that each element is flanked by 2-bp sequences shown in boxes.
DISCUSSION
The IS110/IS492 family includes elements with divergent nucleotide sequences, and transposases encoded by them show only partial homology to one another and to PIV (Fig. 4). A phylogenetic tree constructed on the basis of amino acid sequences of transposases and PIV proteins shows that the IS110/IS492 family elements are classified into several groups, which are distinct from the group consisting of piv genes and IS621 (Fig. 5). This distinction is the reason why piv genes could be exclusively identified by the homology search by using the nucleotide sequence of IS621 as the query.
PIV has been reported to have a triad motif, D-E-D, which corresponds to the catalytic D-D-E motif that is conserved in retroviral integrases and transposases encoded by IS elements with IRs. We have shown in this study that transposases encoded by the IS110/IS492 family elements, including IS621, appear to have the D-E-D (or D-D-D) motif at positions in their N-terminal regions that correspond to those in PIV proteins (Fig. 4). We have also shown that PIV and transposase proteins have another D residue conserved at a position downstream of the triad motif (Fig. 4) and that the four acidic amino acid residues in these proteins are present in positions corresponding to those which constitute the D-E-D-D motif identified as the catalytic center in the RuvC Holliday junction resolvase (Fig. 4 and 6). These findings strongly suggest that these proteins have a tetrad motif, D-E (or D)-D-D, as does RuvC (2, 21, 39). We have also shown in this study that PIV and transposase proteins have several amino acid residues conserved at corresponding positions in their C-terminal half regions (Fig. 4). This finding suggests that the C-terminal half of these proteins may have a domain(s) that is perhaps responsible for DNA binding to the pilin gene inversion sites or the end regions of IS elements, whereas their N-terminal half has the catalytic domain with the tetrad motif.
We have shown in this study that IS621 and two other elements, ISSt1232 and IS1594, do not have terminal IRs, which is consistent with the fact that most IS110/IS492 family elements are atypical and do not have terminal IRs. This finding and the finding that transposases encoded by the IS110/IS492 family elements have a catalytic motif that is similar to that in transposases encoded by the IR-carrying IS elements suggest that the transposases encoded by the IS110/IS492 family elements with no IRs catalyze the strand transfer reaction, as do those encoded by the IR-carrying IS elements.
We have shown that the transposase encoded by IS621 has the highest homology with PIV and that the terminal sequences of IS621 show significant homology with the 26-bp sequences of the pilin gene inversion sites. These findings suggest that IS621 initiates transposition through recognition of its terminal regions and cleavage at its ends by a mechanism similar to that used by PIV to promote site-specific recombination at the pilin gene inversion sites.
Interestingly, in this study we have shown that IS621 is present at a specific site in each of the REP sequences at 10 loci in the ECOR28 genome. We have also shown that ISSt1232 is inserted into a specific site within an IS element repeated in the genome of the archaebacterium S. tokodaii, whereas IS1594 is inserted into REP-like sequences repeated in the Anabaena genome. Note that it is usually difficult to determine the sequences of IS elements, particularly those which do not have terminal IRs and transpose to specific sites in repeated sequences; therefore, not only the sequences of several IS copies but also those around the target sites have to be carefully examined to define the elements.
We assume that IS621 spontaneously transposed to the same site in the REP sequences at the 10 loci in the ECOR28 genome by the action of transposase encoded by itself. It is, however, possible that IS621, once inserted in a REP sequence at one locus, transposed to the REP sequence at another locus by a gene conversion mechanism through recombination between the homologous REP sequences. This possibility is, however, unlikely, because all the IS621 members present at 10 loci are almost identical in their nucleotide sequences, whereas the REP sequences nested by IS621 are of two kinds, Z1 and Z2, which are only partially homologous to each other (Fig. 2). This leads us to assume that IS621 does not recognize the entire REP sequence in transposition but recognizes a short homologous sequence, 15 bp in length, with the target site of insertion in the REP sequences (Table 3). Similarly, in the case of IS1594, it may not recognize the entire REP-like sequence, but it recognizes a homologous 15-bp sequence with the target site in the REP-like sequences (Table 3), which are partially homologous to one another (Fig. 7B). ISSt1232 may also not recognize the entire sequence of an IS element (named ISSt1281) repeated in the S. tokodaii genome, but it recognizes a homologous sequence of 15 bp in length (Table 3), which can be identified in ISSt1281 by comparison with the target site sequences flanking each of the truncated members of ISSt1232 present in the S. tokodaii genome.
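The idea that an element recognizes a short sequence of about 15 bp, which may occur in either orientation and with partial homology, can be sketched as a two-strand search that tolerates mismatches. Everything in the example is hypothetical: the 15-bp `site`, the toy `genome`, and the function names are illustrative and do not reproduce the sequences in Table 3. The code assumes uppercase sequences containing only A, C, G, and T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(genome, site, max_mismatches=0):
    """Return (position, strand) for windows matching `site` on either strand.

    A window that matches in the forward orientation is reported as "+";
    otherwise a match to the reverse complement is reported as "-".
    """
    rc = reverse_complement(site)
    k = len(site)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        if hamming(window, site) <= max_mismatches:
            hits.append((i, "+"))
        elif hamming(window, rc) <= max_mismatches:
            hits.append((i, "-"))
    return hits


site = "ACGTTGCAGGCTAAC"   # hypothetical 15-bp recognition sequence
genome = "TTTT" + site + "GGGG" + reverse_complement(site) + "AAAA"
print(find_sites(genome, site))

# One substitution in the site: found only when a mismatch is allowed.
variant = "TT" + site[:7] + "T" + site[8:] + "TT"
print((2, "+") in find_sites(variant, site, max_mismatches=1))
```

Reporting the strand mirrors the observation that REP sequences at several loci lie in the reverse orientation, while the mismatch tolerance reflects the partial homology between the two REP types.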
TABLE 3. Target sequences possibly recognized by IS110/IS492 family elements
It has been reported that an IS element, IS1397, which belongs to the IS3 family, is inserted into REP sequences (10, 50), like IS621. IS1397 appears to recognize a different region in REP from that recognized by IS621, because IS1397 is present in the center of the region flanked by palindromic sequences in REP, whereas IS621 is inserted into a site in the region outside of a palindromic sequence in REP (Fig. 2). In spite of this difference, IS1397 may recognize a short sequence and transpose into a particular target site within it, as does IS621.
This research was supported by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan.