Journal of Bacteriology, November 2003, p. 6371-6384, Vol. 185, No. 21
0021-9193/03/$08.00+0 DOI: 10.1128/JB.185.21.6371-6384.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
CSIRO Molecular Science, Riverside Life Sciences Centre, North Ryde, New South Wales 2113,¹ Department of Biological Sciences, Macquarie University, Sydney, New South Wales 2109, Australia²
Received 19 May 2003/ Accepted 12 August 2003
The IS110/IS1111 family is another atypical group. The founding member, IS110, and other original members of this group are not bounded by TIR and do not create a duplication of the target site (2, 3, 13). Subsequently, a distinct group of IS that have transposases that are closely related to the IS1111 transposase (14) but that are only more distantly related to the IS110 transposase were identified; these IS have short inverted repeats (IR) (15). In one classification (3), the IS110 family includes these two different groups of IS, on the basis of the rather limited similarities between their transposases. However, it has been suggested that a separate IS1111 family is needed (15), because there are distinct groups of transposases typified by IS110 and IS1111 and because the related IR observed at the presumptive termini of members of the IS1111 family are not found in IS110 family members.
Members of this family encode transposases that were not initially thought to include a DDE motif. However, the Piv protein, which catalyzes site-specific inversion of a segment of the Moraxella lacunata genome, was also shown to be related to the IS110 family transposases, with a small number of amino acid residues completely conserved in all proteins (16). This finding and the fact that some IS in this group are known to have preferred targets and to not produce a target site duplication led to the notion that this family of proteins represented a novel type of site-specific recombinase, rather than a classical DDE transposase (see reference 3). However, mutation of groups of conserved residues in the Piv protein was recently shown to lead to a loss of inversion activity without a loss of DNA binding, and a potential DDD (or DED) motif that appears to correspond to the DDE motif of classical transposases was identified (33).
IS4321 is a member of the IS1111 family that was first found in the multidrug resistance plasmid R751 (32). In R751, two variants of IS4321 that are 96.8% identical and designated IS4321L and IS4321R are found at the boundaries of a putative cryptic compound transposon named Tn4321. The sequence surrounding IS4321L is derived from the mer (mercury resistance) end of Tn501, and IS4321L lies within one of the two 38-bp TIR found in this fragment (Fig. 1A). Closely related 38-bp TIR are normally found at the outer ends of transposons that belong to the Tn21/Tn501 transposon family (9, 10), and IS4321R is found at the same position as IS4321L within a truncated (30-bp) copy of this 38-bp TIR.
FIG. 1. Locations of known copies of IS4321 and IS5075. The 38-bp TIR of Tn21 family transposons are represented by black bars; arrowheads below the bars indicate their orientations. The identities of the adjacent sequences are indicated; e.g., tnp21 indicates the tnp end of Tn21, merpDU indicates one of the mer regions in plasmid pDU1358, and so forth. The IS4321 and IS5075 elements are represented by open boxes; arrows in the boxes indicate the positions and directions of the transposase genes. The plasmid, transposon, or organism where the IS were found are indicated on the left. Sequences were as follows (GenBank accession numbers): R751 from Enterobacter aerogenes (U67194), pRMH760 from K. pneumoniae (AY123253, AY242531, AY242532, and AY242533), pHCM1 from S. enterica serovar Typhi (AL513383), Tn5075 from E. coli (AF457211), A. baumannii (AY196695), and K. pneumoniae (NC_002941).
(supE44 ΔlacU169 [φ80 lacZΔM15] hsdR17 recA1 endA1 gyrA96 thi-1 relA1) was used to propagate plasmids. pRMH760 is from a K. pneumoniae clinical strain isolated at Royal North Shore Hospital, Sydney, New South Wales, Australia, in 1997 (27). pRMH777 is a subclone of pRMH760 that includes IS4321L in the TIR at the tnp end of Tn1696 (TIRtnp1696) (27). pRMH901, which contains IS5075 in the TIR at the tnp end of Tn21 (TIRtnp21), and other subclones of pRMH760 were recovered as described previously (27). The strain for transposition experiments was constructed by transforming pACYC184::Tn21 (4) and pRMH777 (27) into UB1637 (F⁻ his lys trp recA56 rpsL) (4). Pseudomonas aeruginosa strain PAO222 is an auxotrophic derivative of PAO1 (11). Bacteria were routinely cultured at 37°C in Luria-Bertani medium or on Luria-Bertani agar, both containing ampicillin at 100 µg ml⁻¹.
Preparation of DNA. Plasmid DNA was isolated by using an alkaline lysis method (1) or purified for sequencing by using a Wizard maxiprep kit (Promega). Restriction enzymes were used in accordance with the manufacturers' instructions. Fragments were separated by electrophoresis on 1 to 2% (wt/vol) agarose gels and visualized by staining with ethidium bromide. P. aeruginosa PAO222 template DNA for PCR was prepared by suspending a single colony in 1 ml of Tris-buffered saline, pelleting the cells, resuspending them in 100 µl of Tris-EDTA, and boiling the mixture for 10 min. PCR products were purified for sequencing from Tris-acetate-EDTA-agarose gels by using an UltraClean DNA purification kit (Mo Bio Laboratories Inc.). EcoRI-digested bacteriophage SPP1 DNA (GeneWorks) and Hyperladder IV (Bioline) were used as size markers.
PCR amplification. The primers used are listed in Table 1. Amplification reactions were carried out with 50-µl volumes containing PCR Master Mix (Promega), additional MgCl2 to 3 mM, 0.4 pmol of each primer, and approximately 20 ng of plasmid template or 1 µl of P. aeruginosa DNA. Reaction conditions were generally 94°C for 3 min; 35 to 40 cycles of 94°C for 1 min, 50 to 57°C (depending on the melting temperature of the primers) for 1 min, and 72°C for 30 s to 2 min (depending on the lengths of the expected products); and a final incubation at 72°C for 5 min. To obtain sufficient template for sequencing, some products were reamplified after purification of the relevant band.
TABLE 1. PCR primers
Sequence analysis. GenBank searches were performed by using the BLASTN and FastA programs available through WebANGIS (Australian National Genomic Information Service) or through the National Center for Biotechnology Information website (http://www.ncbi.nlm.nih.gov/). Programs in the Genetics Computer Group Wisconsin Package, version 8.1.0, were used via WAG (WebANGIS GCG) to align and analyze DNA sequences. Unpublished sequence data for bacterial genomes were obtained from the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/Projects/Microbes/), The Institute for Genomic Research (http://www.tigr.org), the U.S. Department of Energy Joint Genome Institute (http://www.jgi.doe.gov/JGI_microbial/html/), and the Department of Microbiology at the University of Illinois (http://www.salmonella.org). All of the IS described here have been assigned names, and their sequences have been submitted to the IS Finder database (http://www-is.biotoul.fr/is.html).
Nucleotide sequence accession numbers. The sequence data for the three copies of IS4321 and the one copy of IS5075 have been submitted to the GenBank database under accession numbers AY123253, AY242531, AY242532, and AY242533.
TABLE 2. IS related to IS1111
FIG. 2. Boundaries of IS4321, IS5075, and ISPa11. (A and C) Extents of IS indicated by bars. The IR are boxed (thin lines); arrows indicate the directions of the transposase genes. The base that may originate from either the left or the right end of the IS and the base in the target adjacent to which the IS is inserted are shown in bold type. The 38-bp transposon TIR is boxed (thick lines), and residues that differ in the two common alternate types are shown. (B and D) Sequences of circular intermediates. IRr and IRl are boxed, and potential -35 and -10 regions are underlined. The sequence of the IS1383 circular intermediate (22) is shown for comparison.
To confirm these boundaries, we sought evidence for circular intermediates of IS5075 and IS4321. pRMH777 (27) and pRMH901, subclones of pRMH760 containing a single IS, were used as templates with outward-facing PCR primers, which should detect only circular intermediates or tandem copies of the IS (Fig. 3A). A product of the correct size (433 bp) was detected, and digestion with restriction enzymes that should either cut both products or distinguish between the IS5075 and IS4321 products yielded fragments of the predicted sizes (data not shown). The sequences of the PCR products (Fig. 2B) revealed that the right and left 12-bp IR (IRr and IRl, respectively) were separated by a 10-bp sequence, IRr-AGATAATGAG-IRl, that comprises the abutted terminal sequences of the IS predicted above. This configuration indicates that the IS can be excised precisely and circularized and is consistent with the conclusion that the outlying bases are an intrinsic part of the IS. A circular form of IS1383, which also belongs to the IS1111 family, was identified previously (22), and it also includes an additional 10-bp sequence between the IR of the IS (Fig. 2B). Although these authors concluded that the intervening 10 bp consisted of 5 bp from each side of the IS, the same 10-bp sequence could also consist of 7 or 6 bp from the left end and 3 or 4 bp from the right end, as described here for IS4321 and IS5075, as two bases (bold type in Fig. 2B) could be derived from either the left or the right end of IS1383.
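The composition of the 10-bp spacer can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' analysis: the function names are hypothetical, and the only data taken from the text are the spacer AGATAATGAG and the stated alternatives of 3 or 4 bp from the right end. It circularizes a linear element and lists the ways the spacer could be divided between the two ends.

```python
def circular_junction(element: str, flank: int) -> str:
    """Return the sequence spanning the join made when a linear element
    is circularized: its last `flank` bases followed by its first `flank`."""
    if not 0 < flank <= len(element):
        raise ValueError("flank must be between 1 and the element length")
    return element[-flank:] + element[:flank]


def spacer_splits(spacer: str, right_lengths=(3, 4)):
    """List (right_end_bases, left_end_bases) divisions of a junction spacer.

    The spacer is read in the order IRr-spacer-IRl, so its first bases
    come from the right end of the element and the rest from the left end.
    """
    return [(spacer[:n], spacer[n:]) for n in right_lengths]


# Spacer reported in the text for the IS4321/IS5075 circular junction.
for right, left in spacer_splits("AGATAATGAG"):
    print(f"right end: {right} ({len(right)} bp)  left end: {left} ({len(left)} bp)")
```

Under this reading, the two divisions differ only in the assignment of the fourth base of the spacer, which is why a single junction sequence cannot by itself say which end that base came from.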
FIG. 3. Movement of IS4321. (A) Substrates used to detect the movement of IS4321. Features of the IS, TIR, and adjacent regions are as described in the legend to Fig. 1. pACYC184 and pRMH777 backbone sequences are indicated by broken and thin lines, respectively. The positions and orientations of primers used are indicated by arrowheads with numbers to identify the RH primer (Table 1). The extents of PCR products are indicated by bars; their lengths are shown below the bars. (B) Sequences of PCR products. PCR products are named by the primer pairs giving rise to them. TIR, IR, and ambiguous bases are indicated as described in the legend to Fig. 2; the name of each region is shown above the sequence. The plasmid backbone sequence is shown in lowercase type.
To detect a reconstituted 38-bp target sequence, which would arise if the flanking sequence could be rejoined after the IS was excised, plasmid DNA from a strain containing pRMH777 was also analyzed with a pair of PCR primers flanking the TIRtnp1696 in pRMH777 (Fig. 3A). In addition to a fragment of 1.74 kb, corresponding to the TIR containing IS4321, a faint band of the expected size (415 bp) following the loss of IS4321 was seen. The sequence of this band (Fig. 3B) revealed that IS4321, including the extra bases outside the IR, had been excised precisely to recreate the 38-bp TIR.
P. aeruginosa ISPa11. The P. aeruginosa PAO1 chromosome (31) contains six complete copies of a gene encoding a transposase (open reading frames PA0445 in GenBank accession no. AE004481-2, PA2319 in AE004658, PA2690 in AE004697, PA3434 in AE004764, PA3993 in AE004817, and PA4797 in AE004892-3) that is 42.6% identical to that of IS4321, but the boundaries of the IS have not been reported. Analysis of the DNA sequences of these six regions revealed that they are identical over a stretch of about 1,380 bp that includes 13-bp IR related to those of IS4321 (8 of 12 bp identical) and IS1383 (12 of 13 bp identical) but extends for several base pairs to the left and right (Fig. 2C); these findings suggest that the IS also may target a specific sequence. Searches with the flanking sequence, reconstituted by assuming that the left- and right-hand terminal sequences have lengths equivalent to those of IS4321, IS5075, and IS1383, revealed over 50 uninterrupted copies of the target sequence in the PAO1 chromosome, permitting the ends to be deduced. The circular intermediate of ISPa11 was detected by PCR with DNA from strain PAO222, an auxotrophic derivative of PAO1 (11), and appropriate primers (Table 1). The sequence of the junction (Fig. 2D) confirmed that the IS does indeed include 6 or 7 bases beyond IRl and 4 or 3 bases beyond IRr. It appears that the presence of at least one residue of ambiguous origin (left or right) is a feature of this family of IS, although the identity of the residue(s) is not conserved.
Promoter in the circular intermediate. A promoter consisting of a -35 region located just inside the right-hand end of the IS and a -10 region created by the fusion of the right- and left-hand terminal sequences was detected in the circular form of IS1383 (22). Equivalent -35 and -10 regions are present in the circular intermediates of IS5075 and IS4321 and the P. aeruginosa IS detected here (Fig. 2). Equivalent promoters are also generated by fusion of the deduced ends of several additional IS1111 and IS4321 relatives (see below), indicating that this promoter in the circular intermediate is also a characteristic of this family. It is likely to play a role in the movement of IS1111 family members, perhaps transiently increasing the expression of the transposase. As the expression of the transposase presumably is required first to form the circular intermediate, we searched for expression signals upstream of the transposase gene but within the left end of the IS. A potential promoter was found in IS4321, IS5075, and the related IS1328 and ISSfl8 but not in IS1383 and ISPa11, which may rely on promoters in the adjacent sequence for the initial expression of the transposase.
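A search for such junction promoters can be sketched as a simple scan for a -35-like hexamer followed, at a fixed spacing, by a -10-like hexamer. Everything in this example is an assumption made for illustration: the consensus hexamers TTGACA and TATAAT are the textbook E. coli sigma-70 consensus rather than sequences given in this article, the 16- to 18-bp spacing is the range quoted elsewhere in this article for the predicted junctions, and the test sequence is invented.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatches=2, spacers=(16, 17, 18),
                   box35="TTGACA", box10="TATAAT"):
    """Return (pos35, spacer, pos10) for each -35/-10 pair found.

    Positions are 0-based starts of each hexamer; `spacer` is the number
    of bases between the end of the -35 box and the start of the -10 box.
    """
    hits = []
    for i in range(len(seq) - len(box35) + 1):
        if mismatches(seq[i:i + len(box35)], box35) > max_mismatches:
            continue
        for s in spacers:
            j = i + len(box35) + s
            if j + len(box10) > len(seq):
                continue
            if mismatches(seq[j:j + len(box10)], box10) <= max_mismatches:
                hits.append((i, s, j))
    return hits


# Invented sequence: a perfect -35 box, 17 filler bases, a perfect -10 box.
toy = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GG"
print(find_promoters(toy, max_mismatches=0))
```

A hexamer match of this kind only flags a candidate; whether a candidate is an active promoter would still have to be tested experimentally.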
IS1111 subgroup of the IS1111 family. A small number of IS with transposases related to that of IS1111 and with IR at or near their ends were identified previously (3, 15, 29, 39). Searches of DNA sequences in GenBank revealed many additional predicted proteins related to the IS1111 and IS4321 transposases. Those that are complete and exhibit more than 30% identity to the IS4321 and IS1111 transposases, together with a few examples that include frameshifts in the transposase gene (either errors or mutations) that can be tentatively corrected by using the sequence of their next closest relative as a guide, were selected for further analysis (Table 2). The proteins range in length from 334 to 354 amino acids, and alignment of the sequences (Fig. 4) revealed a small number of completely conserved amino acids and several more that are conserved in all but a few of these proteins. At several additional positions, only one of two alternative amino acids (e.g., D or E, R or K, and S or T) is present. Searches of available incomplete bacterial genome sequences revealed many more IS1111 and IS4321 relatives (Table 3), and an alignment (data not shown) of all 46 complete transposase sequences that did not require frameshift correction revealed the same conserved residues (residues conserved in 45 or 46 of these 46 sequences are indicated by asterisks in Fig. 4). In the N-terminal domain, the conserved amino acids include five acidic residues, D residues at positions 15, 100, and 103 and E residues at positions 60 and 156, and four basic residues, K18, R41, R70, and K98. The D15, E60, and D100 or D103 residues correspond to the DDD residues in the Piv protein that are thought to correspond to a transposase DDE motif (16). Additional blocks of conserved residues are found in the C-terminal portions at positions equivalent to stretches of conserved residues identified in previous alignments of Piv with IS110/IS1111 family transposases (16).
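The 30% identity criterion can be illustrated with a minimal calculation on a pairwise alignment. This is a sketch under stated assumptions, not the procedure used in the article: the text does not say how identity was computed, so the denominator chosen here (columns in which neither sequence has a gap) is an assumption, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Both inputs must be rows of the same alignment (equal length, '-' = gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b)
                if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def passes_cutoff(aligned_a: str, aligned_b: str, cutoff: float = 30.0) -> bool:
    """True when identity exceeds the cutoff (30% in the text)."""
    return percent_identity(aligned_a, aligned_b) > cutoff


# Invented toy alignment: 4 identical residues in 5 ungapped columns.
print(percent_identity("MKT-LV", "MKSALV"))
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give different values for the same alignment, so a cutoff is only meaningful alongside the definition used.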
FIG. 4. Alignment of IS1111 family transposases. Twenty transposases from the IS shown in Table 2 are aligned. Only one example was included for pairs of proteins that were >90% identical (IS4321L and IS5075, IS1618 and ISSm1, and IS1492 and ISPsy16). An alignment of all 46 transposases without frameshifts from Tables 2 and 3 was used to define conserved amino acids. Residues conserved in all or all but 1 of the 46 sequences are indicated by asterisks above the sequence. Residues differing in 6 or fewer of the 46 sequences are shown as white on black and are indicated by uppercase letters below the sequence. Where only two alternative amino acids in 45 or 46 sequences are found, the two alternatives are indicated by lowercase letters below the sequence. The sequences used are from translations of nucleotide sequences identified by the GenBank accession numbers listed in Table 2.
TABLE 3. IS related to IS1111 in genome sequencing projects
It was previously assumed that, by analogy with the majority of IS, the IR represent the termini, and similarities noted between the sequences flanking the IR for IS4321 and IS1328 (32) and for several IS belonging to the IS1111 group (15, 38, 39) were believed to indicate an element of target site specificity. As the terminal sequences could account for part of these regions, we sought evidence for their presence in additional IS. Where both IR could be identified, the flanking sequences (up to 100 bp from each side) were joined, assuming that 10 extra bases are part of the IS and that there is no duplication of the target sequence. The resulting sequence was used to search for copies of the uninterrupted target sequence that would allow deduction of the ends of the IS. Sufficiently closely related sequences were found in many cases, and in all of them it appeared that terminal sequences were present. The sequence of the junction in each of the predicted circular intermediates is shown in Fig. 5, with the base(s) that could be derived from either the right- or the left-hand side indicated in bold type. The terminal sequences are related and, when joined, most include a potential -10 region at a suitable distance (16 to 18 bp) from a -35 region in the right-hand end. An additional feature of the left-hand terminal sequence is that it includes a tetranucleotide similar to the outer 4 bp of the IR. This tetranucleotide is ATGG or ATGA in most examples but TTGG for the group with IR beginning with GTGG. All but a few of the examples found include at least 1 bp that could have originated from either the right- or the left-hand end (bold type in Fig. 5), confirming that this feature is a general property of IS1111 family members.
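The comparison of an occupied site with an uninterrupted target, and the origin of the bases of ambiguous origin, can be sketched as follows. This is an illustration under simplifying assumptions (exact matches only, no target duplication, one insertion per site); the function name and the toy sequences are hypothetical and are not taken from the article. Every placement of the element that leaves exactly the uninterrupted target when removed is reported, and more than one placement means that at least one base could belong to either end.

```python
def consistent_boundaries(occupied: str, empty: str):
    """Return every (start, end) such that deleting occupied[start:end]
    leaves exactly the uninterrupted target sequence `empty`."""
    length = len(occupied) - len(empty)
    if length <= 0:
        raise ValueError("occupied site must be longer than the empty target")
    hits = []
    for start in range(len(empty) + 1):
        end = start + length
        if occupied[:start] == empty[:start] and occupied[end:] == empty[start:]:
            hits.append((start, end))
    return hits


# Toy example: the element GGCCT sits between ACGT and TGCA, and its last
# base (T) matches the last base of the left flank.
empty_site = "ACGTTGCA"
occupied_site = "ACGT" + "GGCCT" + "TGCA"
for start, end in consistent_boundaries(occupied_site, empty_site):
    print(start, end, occupied_site[start:end])
```

In the toy case two placements are consistent, TGGCC and GGCCT, which differ only in whether the T is assigned to the left or the right end of the element; both give the same circular junction when the ends are joined.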
FIG. 5. Sequences of predicted circular forms of IS1111 family elements. The sequences shown are deduced from comparisons of sequences flanking the IS and uninterrupted target sequences. Features are indicated as described in the legend to Fig. 2B. The distances to the initiation codons of the transposase genes are also shown. Where the comparison predicts the origins of all central bases, a vertical arrow indicates the junction of the left and right ends.
The length of the left-hand terminal sequence was not absolutely conserved. Some IS included 7 or 6 bp on the left and 3 or 4 bp on the right, as was found experimentally for IS4321, ISPa11, and IS1383; others included 6 or 5 bp on the left and 3 or 4 bp on the right. The latter group would have 9 bp separating the IR in the circular intermediate and would include four members of the IS1618 group described above (Fig. 5). Alignment of the circular junctions reveals that, for the IS1618 group, the first residue of IRl (which is 1 bp longer than IRr) may be equivalent to the 10th base of the 10-bp spacer sequence, raising the possibility that the two IR may be recognized in slightly different manners. Additional IS with predicted terminal sequences totaling 9 bp fell into two groups. For ISPpu11 and ISSfr1, a -10 promoter region was formed, and there was at least 1 base of ambiguous origin. For IS1328 and ISAzvi2, these features were not apparent, and indeed close relatives have terminal sequences totaling 10 bp. It is possible that in these IS, there is an error in one of the sequences compared. This notion is supported by the fact that the terminal sequences of the closest relative are very similar but include an additional base at the position normally occupied by the one that could be derived from either end. Furthermore, if a 10-bp spacer is assumed, a -10 region is expected. For the remaining IS listed in Tables 2 and 3, an uninterrupted target sequence was not found, and the length of the element was calculated assuming that the terminal sequences total 10 bp or, for the IS1618 group, 9 bp. Generally, the sequences of the termini of these IS were conserved, and a base of ambiguous origin was found in most of them.
Target site specificity. A few IS1111 family members are known to have a preferred target site into which they insert in a preferred orientation. IS1111 is found at one end of related stem-loop structures that were originally thought to form part of the IS (14), and all of the copies in the Coxiella burnetii genome (30) were found to be similarly located (Fig. 6A). Over 50 additional copies of this target sequence were identified in the C. burnetii genome. IS1383 targets the 12-bp TIR of IS1384 (22), and preferred targets for IS4321 or IS5075 and ISPa11 were also identified in this study (Fig. 2). However, there is no obvious relationship among the sequences of the targets for these IS, indicating that each group of closely related IS selects a different target. For all of the remaining IS for which multiple copies have been sequenced, when the sequences adjacent to their ends were aligned, a potential target site sequence was found; for some, however, sequence identity extended for only a few base pairs on either side of the IS (Fig. 6C). For ISPpu11, ISPsy7, ISAzvi2, ISEch3, and ISYen1, the targets were longer, and multiple copies of them were found in the genome of the organism where the IS was found; these results indicate that genomic repeat sequences were used as targets. For four members of the IS1618 group, the targets were a set of three 38-bp TIR that are closely related to one another and to the TIR of Tn6901 (GenBank accession no. AP004237) and that are related to but distinct from the TIR of Tn21 (Fig. 6B). The IS inserts at a position different from that recognized by IS4321 and IS5075. Short regions of similarity to the target site region of the TIR were found adjacent to two additional members of the IS1618 group (Fig. 6B). These findings suggest that preferred targets may be a general feature of the IS1111 family but that a variety of different sequences are used as targets for individual IS.
FIG. 6. Targets for additional IS. The IS is inserted to the right or left of the base in bold type. (A) Stem-loop structures targeted by IS1111. (B) The 38-bp TIR and related sequences recognized by members of the IS1618 group are compared to the 38-bp TIR recognized by Tn4321 and Tn5075. Dots indicate unknown sequences. (C) Sequences targeted by additional family members. Bases in lowercase letters are not completely conserved.
Most of the properties of IS4321 and IS5075 appear to be general features of IS1111 family members, as many additional family members were found to include short IR, usually 11 to 13 bp, and terminal sequences (Tables 2 and 3 and Fig. 5). The IR sequences fall into a number of distinguishable groups but clearly are related. In most of them, the sequence at the outer ends of the IR is related to the consensus sequence 5'-ATGGACGC-3' and is followed by a short tract of predominantly C residues. Evidence for terminal sequences comes from the sequence of the circular intermediate of ISPa11 and from comparisons to uninterrupted target sequences, which were identified in many examples. The terminal sequences generally total 10 bp, although in a few examples they appear to be 1 bp shorter. For members of the IS1618 group, where 9 bp separate the IR in the predicted circular intermediate, other distinguishing features are shared: a characteristic 3-amino-acid insertion and a 5-amino-acid deletion are present in the transposase, the outer ends of the IR are distinctive (5'-AAGGG and CCCTT-3'), and IRl includes 1 bp more than IRr. The latter feature may be related to the apparent 9-bp spacer if IRl is recognized in a manner different from that of IRr, i.e., as AGGG or as AAGG. A comparison of the 9-bp sequence to the 10-bp consensus sequence seen in other circular intermediates reveals that the 10th base is missing and that the 1st base of IRl may be equivalent to it. For some of the remaining examples where a 9-bp spacer was predicted, the 4th base to the right (or the 7th base to the left) is missing, and close relatives have 10-bp spacers. Here, it is possible that the prediction has missed 1 bp. An examination of the sequences of the appropriate circular intermediates should clarify this matter.
An additional general feature of the IS1111 family members examined here is that a -10 promoter region is formed when the ends are brought together in the circular intermediate (Fig. 2 and 5). This -10 region is generally separated by an appropriate distance from a -35 region found at the right-hand end of the IS, and this promoter may be important for expression of the transposase, as has been shown for IS911 (34, 35). The -35 region is also likely to influence the expression of adjacent genes if a -10 region at an appropriate distance, i.e., at the boundary of the IS, is formed. A potential promoter was also found completely within the left-hand end of the IS in a fraction of examples. The overall organization of these IS is also conserved, with 50 to 100 bp separating IRl from the beginning of the transposase gene and 200 to 300 bp between the termination codon and IRr. It seems unlikely that such a long downstream region would be retained if it had no function, but a short open reading frame was found in only some examples and the features it contains thus remain to be established.
A number of IS1111 family members clearly have a preferred target site. These targets include short sequences of up to 16 bp as well as the TIR of other IS or transposons and various sequences found in multiple copies in the genome of a particular organism. In some of the examples evaluated here, the genomic repeats are composed of IR that can fold to form a stem-loop structure. How these targets are recognized and recruited remains to be established, but it is possible that some structural feature of the DNA rather than the sequence itself is important, as has been proposed for IS231, which also targets a 38-bp TIR sequence (12).
IS families are based on similarities in organization, transposases, and IR (3, 20). The analysis presented here supports the proposal of Lauf et al. (15) that the IS1111 group represents a family. The IS that were previously designated as belonging to the IS1111 subgroup of the IS110 family (3) and the additional ones examined here have transposases that are generally more closely related to one another (>30% identity) than to the transposases of the IS110 family (generally less than 20% identity). Members of the IS1111 family also differ from members of the IS110 family in that they include short, 11- to 13-bp IR. This separation will greatly simplify the analysis of this burgeoning IS type.
It is possible that additional subgroups of the IS1111 family will be found in the future as further detailed analyses are undertaken. For example, IS1533, which is found in multiple copies in the chromosomes of several Leptospira species, has been grouped with IS1111 (3, 15). IS1533 was one of the first IS110/IS1111 family members sequenced, and IR flanked on both sides by additional conserved residues were identified (38, 39). Although in one example the N-terminal portion of the predicted transposase sequence showed similarities to the sequences of a group of IS110-related transposases, including IS1111 (39), the other predicted product was not the same, and both sequences appear to include multiple frameshifts in the transposase gene. An additional IS1533 sequence (GenBank accession no. X77623) found in our searches predicts a transposase that is about 24% identical to the IS4321 and IS1111 transposases and aligns over its full length with those in Fig. 4, making it a member of the broader IS1111 family. Searches with this sequence revealed a small number of IS1533 relatives (>30% identity), and their alignment revealed clusters of conserved residues at positions similar to those in the IS1111 subgroup (Fig. 4), with several completely conserved residues common to both groups. However, there were significant differences. In particular, only three of the four possible acidic residues that form the DED motif in the IS1111 subgroup were also completely conserved in the IS1533 subgroup. The G-D--K and EA motifs were conserved, but the K-D--DA motif in the IS1111 subgroup was K-D/N--DA in the IS1533 subgroup, indicating that the critical D residue is likely to be the second one. A fifth conserved E residue lying to the right of the DED motif was present in IS1533 relatives, but it was closer to the K-D--DA motif than E156 in Fig. 4. For most members of this IS1533 subgroup, IR and possible terminal sequences could be found. However, the IR were shorter and, with the exception of IS1533, usually more closely related to one another than to those of the IS1111 subgroup. Although a more detailed analysis of this group is needed to resolve the precise relationships, the members of the IS1533 subgroup are most closely related to those of the IS1111 subgroup and belong to the IS1111 family.
Members of the IS110 family share features with the IS1111 family in addition to the similarity in their transposase sequences, which all include a small number of completely conserved residues. A comparison of the sequences of integrated copies with the target sequence allowed their boundaries to be identified and led to the conclusion that these IS were integrated conservatively, i.e., without the creation of a short duplication of the target site. Although for IS492 a 5-bp duplication was proposed (28), one copy of this sequence was found in a circular intermediate and the data would be equally well explained if one copy were part of the IS and the other copy were present in the target but fortuitously or necessarily identical. IS4321 and IS5075 also changed locations without duplicating any residues in the target. Circular forms of IS117 (13) and IS492 (28) have been reported and, for IS492, the formation of a strong promoter at the circle junction has been demonstrated. A number of other IS which belong to the IS110 subgroup of the IS110 family appear to have preferred targets (3, 5).
A number of questions about the IS1111 family and its relationship to the IS110 family remain to be answered. First, we have not extended our bioinformatic analysis exhaustively, and it is possible that additional subgroups of the IS1111 family beyond the IS1111 and IS1533 subgroups will be found among the IS whose transposases were less than 30% identical to that of IS1111, the value which was used as the cutoff in our analysis. Examination of a small number of examples revealed the presence of IR and, although the IR that we found were shorter than those reported here for the IS1111 family, they were related to the outer 8 bp. Second, whether other IS presently classified as IS110 family members have IR has not always been examined. Imperfect IR have been noted near the boundaries of IS492, as deduced from the sequence of the circular form (28), and we found the configuration CCAT-10 bp-ATGG in that sequence. It is possible that very short IR are also present in other IS. Third, the way in which these IS recognize their preferred target and the features of the target that are recognized remain to be elucidated. In vitro analysis to identify the regions in the IS and in the target bound by individual transposases should assist in clarifying some of these issues. It also remains to be established whether the conserved DDD motif in the IS110 family and the DED motif in the IS1111 family are indeed equivalent to the DDE motifs of classical IS or whether the mechanism of movement of IS110 and IS1111 family members is indeed more akin to site-specific recombination.
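A configuration such as CCAT-10 bp-ATGG can be looked for with a small scan for short inverted repeats separated by a fixed spacer. The sketch below is illustrative only: the function names, the default arm and spacer lengths, and the test sequence are assumptions chosen to mirror that one configuration, and only perfect repeats are reported, whereas the IR noted for IS492 are described as imperfect.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int = 4, spacer: int = 10):
    """Return 0-based start positions of perfect inverted repeats:
    an `arm`-bp word, `spacer` bases, then the word's reverse complement."""
    hits = []
    span = 2 * arm + spacer
    for i in range(len(seq) - span + 1):
        left = seq[i:i + arm]
        right = seq[i + arm + spacer:i + span]
        if right == reverse_complement(left):
            hits.append(i)
    return hits


# Invented sequence containing CCAT, ten arbitrary bases, then ATGG.
toy = "TT" + "CCAT" + "ACGTACGTAC" + "ATGG" + "AA"
print(find_inverted_repeats(toy))
```

With arms as short as 4 bp, matches are expected by chance in any long sequence, so hits from a scan like this are only a starting point for closer inspection.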
S.R.P. was supported by grant no. 192108 from the Australian National Health and Medical Research Council.