Axel W. Strittmatter,2
Anke Henne,2,
Gerhard Gottschalk,2 and
Susanne Fetzner1*
Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, D-48149 Münster, Germany,1 Laboratorium für Genomanalyse, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, D-37077 Göttingen, Germany2
Received 17 January 2007/ Accepted 27 February 2007
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
For the genus Arthrobacter, pAL1 was the first extrachromosomal DNA element shown to be a linear replicon (40). As a first approach to understand the function of pAL1, we determined its complete sequence in order to perform a functional annotation of its putative genes and to model secondary structures of putative single-stranded 3' telomeric overhangs of this plasmid. We analyzed the operon organization of catabolic genes on pAL1 and the carbon source-dependent expression of these genes.
| MATERIALS AND METHODS |
|---|
An extinction coefficient at 325 nm (ε325) of 10.2 mM⁻¹ cm⁻¹ was used to estimate 1H-4-oxoquinaldine concentrations. When aromatic substrates were consumed, portions of substrate stock solutions were added to the cultures to obtain the appropriate final concentrations. A pAL1-deficient mutant of strain Rü61a (40) was grown in the presence of streptomycin (50 µg/ml) and rifampin (25 µg/ml). The involvement of a canonical molybdenum hydroxylase in hypoxanthine utilization was assessed by replacing the molybdate in the mineral salts medium with the same concentration of tungstate. To test for carbon catabolite repression of the degradation of aromatic compounds, cells of A. nitroguajacolicus Rü61a grown for about 16 h on succinate were harvested by centrifugation, washed twice in saline, and resuspended in mineral salts medium with 10 mM succinate supplemented with either 2 mM quinaldine or 2 mM 1H-4-oxoquinaldine. Escherichia coli DH5α (17), which was used as a plasmid host, was grown at 37°C in lysogeny broth (LB) (52) supplemented with ampicillin (100 µg/ml) if appropriate. For amplification of cells carrying the shotgun library of the pAL1 plasmid, chemically competent E. coli One Shot TOP10 cells (Invitrogen, Karlsruhe, Germany) were transformed and were grown at 37°C and 350 rpm in 2x LB for 20 h.
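The concentration estimate from the absorption coefficient at 325 nm follows the Beer-Lambert relation, c = A / (ε · l). The following is a minimal sketch of that arithmetic, not the authors' procedure; the function name, the 1-cm path length default, and the example absorbance are illustrative assumptions.

```python
def concentration_mM(absorbance, epsilon_mM_cm=10.2, path_cm=1.0):
    """Estimate concentration (mM) from absorbance via Beer-Lambert.

    epsilon_mM_cm: absorption coefficient in mM^-1 cm^-1 (10.2 at 325 nm,
                   the value given in the text for 1H-4-oxoquinaldine).
    path_cm:       cuvette path length in cm (assumed 1 cm here).
    """
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon_mM_cm * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A325 = 0.51 in a 1-cm cuvette.
    print(round(concentration_mM(0.51), 3), "mM")
```

Samples with absorbance outside the linear range of the photometer would need dilution before applying the relation.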
DNA techniques.
Genomic DNA of A. nitroguajacolicus Rü61a and of the pAL1-deficient mutant was isolated by using the method of Rainey et al. (46). Plasmid DNA was obtained from E. coli DH5α clones with an E.Z.N.A. plasmid mini kit I (peqlab, Erlangen, Germany). Competent E. coli cells were prepared as described by Hanahan (19). DNA restriction and agarose gel electrophoresis were carried out using standard procedures (52). PCR was performed using the Expand High Fidelity PCR system (Roche, Mannheim, Germany) or the Triple Master PCR system (Eppendorf, Hamburg, Germany). Random-primed labeling of probes, blotting, hybridization, and colorimetric detection with nitroblue tetrazolium salt and 5-bromo-4-chloro-3-indolylphosphate were performed by using the methods recommended in the DIG System user's guide for filter hybridization (Roche Molecular Biochemicals, 1995).
Preparation and subcloning of pAL1 DNA.
Preparation of cell plugs, which always included proteinase K treatment, pulsed-field gel electrophoresis, and isolation of pAL1 from gels by electroelution were performed as described previously (40). For construction of shotgun libraries, purified pAL1 plasmid DNA was partially digested for 10 to 20 s using the blunt-cutting enzyme Bsp143I or AluI. DNA fragments were purified by gel electrophoresis (fragment size, 2.0 to 3.0 kb). After gel elution with Qiaquick (QIAGEN, Hilden, Germany), DNA fragments were filled in using T4 polymerase, 5' adenylated using Taq polymerase, dephosphorylated by treatment with calf intestinal phosphatase in a buffer recommended by the supplier (all enzymes were obtained from MBI Fermentas, Vilnius, Lithuania), and cloned into the pCR4-TOPO vector (Invitrogen). Recombinant plasmids were transformed into chemically competent E. coli One Shot TOP10 cells. For cloning of the terminal fragments of pAL1, plasmid DNA was digested with PstI, and fragments were ligated into pBluescript II SK(+) digested with PstI and blunt-cutting EcoRV. E. coli DH5α was transformed with the ligation mixture, and transformants were selected on LB agar plates containing 100 µg/ml ampicillin, 40 µg/ml isopropyl-β-D-thiogalactopyranoside, and 400 µg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside. Plasmids pBSK5 and pBSK3, containing 3.0- and 2.0-kb inserts, respectively, were recovered from the transformants. To ensure coverage of the ends of pAL1, terminal fragments were cloned again from A. nitroguajacolicus Rü61a genomic DNA, which was isolated using a protocol (57) that in addition to proteolytic digestion includes treatment with alkali in order to cleave the alkali-labile ester linkage between the residual peptide of Tp and the DNA 5' ends. PstI-digested genomic DNA was then hybridized with probes "lt" and "rt," which were obtained by PCR amplification from pBSK5 and pBSK3, respectively (see Table S1 in the supplemental material for a description of the primers). Genomic DNA isolated from the pAL1-deficient mutant of A. nitroguajacolicus Rü61a did not hybridize with these probes. Hybridizing PstI fragments of DNA from the wild-type strain were extracted from agarose gels and ligated into PstI/EcoRV-digested pBluescript II SK(+) as described above. Four and five plasmids containing the 3.0- and 2.0-kb terminal fragments, respectively, were identified by colony blotting of E. coli DH5α transformants using probes "lt" and "rt," and all inserts were sequenced.
DNA sequencing. Sequencing of isolated plasmid DNA was performed as described previously (48), using BigDye Terminator 3.1 chemistry and a 3730XL capillary sequencer (Applied Biosystems, Darmstadt, Germany). For sequencing of pCR4 and pBluescript II SK(+) derivatives, standard vector primers were used (see Table S1 in the supplemental material). For direct sequencing of derived PCR fragments, custom-made PCR primers were used in standard sequencing reactions performed with recommended annealing temperatures.
RNA extraction and RT-PCR. For isolation of total RNA, A. nitroguajacolicus Rü61a was grown in mineral salts medium with different carbon sources to an optical density at 600 nm of about 1.0. Aliquots (6 ml) of the cultures were frozen in liquid nitrogen and stored at −80°C. After thawing on ice, bacteria were harvested by centrifugation at 4,000 x g and 4°C for 2 min. The cells were resuspended in 100 µl of Tris-EDTA buffer (pH 8.0) containing 14 mg lysozyme/ml and 20 U RNase inhibitor (RNasin Plus; Promega, Madison, WI) and incubated at 28°C for 3 h. Total RNA was isolated from the pretreated cells with an RNeasy kit (QIAGEN) by following the instructions of the supplier, including an on-column DNase digestion step. Residual DNA was removed by digestion with 1 U RNase-free DNase (Promega) per 1 µg RNA in the presence of 20 U RNase inhibitor. Samples were incubated for 45 min at 37°C, the DNase was inactivated, and the RNA was repurified. Reverse transcription (RT)-PCR was performed with a RevertAid H minus first-strand cDNA synthesis kit (MBI Fermentas). The cDNA synthesis reaction was carried out at 43°C with 1 µg of total RNA and random hexamer primers. For negative controls, reverse transcriptase was omitted from the reaction mixture. A PCR for amplification of cDNA was performed by using 10-µl assay mixtures containing 1 µl of cDNA, 20 pmol (each) of the forward primer and the reverse primer (see Table S1 in the supplemental material), and 0.75 U of GoTaq DNA polymerase (Promega).
Identification of transcriptional start sites. Transcriptional start sites were determined by rapid amplification of 5' cDNA ends (5'-RACE) using a 5'/3'-RACE kit from Roche according to the manufacturer's instructions. For cDNA synthesis, 1 µg of total RNA, isolated from quinaldine-grown cells, and specific primer SP1 (see Table S1 in the supplemental material) were added. In the next steps, nested primers SP2 and SP3 (see Table S1 in the supplemental material) were used to obtain specific products of the tailed cDNA, and PCR was carried out with the Expand High Fidelity PCR system (Roche). PCR products were purified by gel extraction (E.Z.N.A. gel extraction kit; peqlab) and were sequenced (MWG Biotech, Ebersberg, Germany).
Enzyme assays and polyacrylamide gel electrophoresis. A. nitroguajacolicus Rü61a cells grown on different carbon sources were suspended in 100 mM Tris-HCl buffer (pH 8.0), treated with 2 mg/ml lysozyme for 30 min at 37°C, and disrupted by sonication on ice. Crude extracts containing soluble proteins were obtained by centrifugation for 45 min at 36,000 x g. Qox activity was determined spectrophotometrically by measuring the quinaldine-dependent reduction of the artificial electron acceptor iodonitrotetrazolium chloride (INT) (41). Hod activity was measured as described previously (15), using 1H-3-hydroxy-4-oxoquinaldine as the substrate. Protein concentrations were estimated by the method of Bradford, as modified by Zor and Selinger (69), using bovine serum albumin as the standard protein. Nondenaturing polyacrylamide gel electrophoresis was performed in resolving gels with a final acrylamide concentration of 12% (wt/vol) or 7.5% (wt/vol), using the high-pH discontinuous system described by Hames (18). For activity staining of xanthine oxidoreductase, gels were immersed in 100 mM Tris-HCl buffer (pH 8.5) containing 0.3% (vol/vol) Triton X-100, 1.5 mM INT, and 1 mM hypoxanthine or xanthine. For activity staining of Qox, gels were soaked in the same buffer containing 1.25 mM INT and 50 µM quinaldine.
Sequence analysis. The sequence of pAL1 was determined using a standard shotgun library with an insert size of 2 to 3 kb. A total of 1,484 reads with an average read length of 616 bp were performed. After the first assembly using the Phrap assembly tool (http://www.phrap.org), primer walking on plasmids and PCR-based techniques were used to close remaining gaps and to solve misassembled regions. All manual editing steps were performed using the GAP4 software package (v4.5 and v4.6) (58, 59). After sequence polishing and finishing, the plasmid sequence had 7.0-fold sequence redundancy and assembled into a single contig. Coding regions of pAL1 were identified with the ARTEMIS DNA annotation tool (49), with the heuristic approach of GENEMARK (6), and with FRAMES at HUSAR 4.0 (http://genome.dkfz-heidelberg.de/). Sequences were analyzed with the BLAST family of programs (1) for database searches, GAP for binary alignments and calculation of similarities and identities, and ClustalW (21) and T-Coffee (39) for multiple alignments. Open reading frames (ORFs) with hypothetical ribosome binding sites but lacking BLAST hits were manually annotated. Hypothetical gene products were scanned with PROSITE (http://www.expasy.org/prosite/) to obtain functional information. The SOSUI program (24) was used to search for putative transmembrane helices. Possible secondary structures of putative terminal 3'-strand overhangs of replicative intermediates of pAL1 were computed with mfold (version 3.1) (53, 70), based on a folding temperature of 30°C and Na+ concentrations of 100 mM to 1 M. Theoretical pI values were calculated using the Compute pI/Mw tool at http://www.expasy.ch.
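The relation between read count, read length, and sequence redundancy is simple arithmetic and can be sketched as follows. This is only an illustration: the replicon length is not stated in this section, so it is a parameter and the value in the example is a placeholder. Raw read bases also overstate what survives quality trimming, so a raw estimate is expected to exceed the redundancy of the finished assembly.

```python
def total_read_bases(n_reads, mean_read_len_bp):
    """Total number of raw bases produced by a shotgun run."""
    return n_reads * mean_read_len_bp


def fold_redundancy(n_reads, mean_read_len_bp, replicon_len_bp):
    """Raw fold redundancy: total read bases divided by replicon length."""
    if replicon_len_bp <= 0:
        raise ValueError("replicon length must be positive")
    return total_read_bases(n_reads, mean_read_len_bp) / replicon_len_bp


if __name__ == "__main__":
    # Read count and mean length are taken from the text; the replicon
    # length of 100,000 bp is a hypothetical placeholder.
    print(total_read_bases(1484, 616))
    print(round(fold_redundancy(1484, 616, 100_000), 2))
```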
Nucleotide sequence accession number. The complete sequence of linear plasmid pAL1 from A. nitroguajacolicus Rü61a has been deposited in the EMBL nucleotide sequence database under accession number AM286278.
| RESULTS AND DISCUSSION |
|---|
A total of 103 ORFs were identified on pAL1, which cover 84.6% of the plasmid sequence (Fig. 2). ORFs which were functionally annotated are briefly described in Table 1. Table S2 in the supplemental material lists all ORFs identified on pAL1. For 49 of the putative genes, no function could be predicted. Genes coding for quinaldine degradation and genes presumed to code for reactions involved in anthranilate catabolism are clustered in a 31-kb region (ORFs 3 to 25), whereas a 53-kb region (ORFs 63 to 103) appears to code for proteins involved in DNA mobilization, plasmid maintenance, and DNA replication and repair. Genes presumed to have a role in telomere patching are localized at the end of pAL1 (ORFs 101 to 103).
The functions of qoxLMS (ORFs 4 to 6) encoding quinaldine-4-oxidase (Qox), hod (ORF 8) coding for the 2,4-dioxygenase catalyzing heterocyclic ring cleavage, and amq (ORF 9) encoding N-acetylanthranilate amide hydrolase were determined by heterologous gene expression analysis (15, 32, 41). The physiological role of the 1H-4-oxoquinaldine 3-monooxygenase gene moq (ORF 7) was confirmed by interposon mutagenesis (K. Parschat, unpublished data). The presumed superoxide dismutase (SOD) encoded by ORF 14 belongs to the family of iron- or manganese-containing SODs. SOD activity might be particularly necessary when A. nitroguajacolicus Rü61a grows on quinaldine, since incomplete reduction of O2 by Qox, which is presumed to use dioxygen as its physiological electron acceptor, would produce superoxide anion radicals instead of H2O2. A. nitroguajacolicus Rü61a has another, probably chromosomal, SOD gene, since PCR performed with template DNA from the pAL1-deficient mutant and primers specific for ORF 14 (see Table S1 in the supplemental material) resulted in a 544-bp DNA fragment that exhibited 97% identity to the corresponding part of ORF 14.
ORF 3, which is localized directly downstream of qoxLMS, codes for an XdhC-like protein which, by analogy to the archetypal XdhC protein (38), may be involved in insertion of the molybdenum cofactor (Moco) into the Qox protein. Homologs of xdhC are often localized in the vicinity of genes encoding molybdenum hydroxylases; in the case of pucABCDE (55), xdhABC (36), and the quinoline catabolic gene cluster (7), they are even transcribed in an operon.
The ORF 11 protein (ORF11p) of pAL1 exhibited 24% identity to E. coli MobA (accession no. P32173). MobA catalyzes the condensation of Mo-molybdopterin and GTP, forming molybdopterin guanine dinucleotide. However, Qox, like many catabolic molybdenum hydroxylases, contains the molybdopterin cytosine dinucleotide form of the molybdenum cofactor. It is interesting that asparagine residue 53 and aspartate residue 71 of E. coli MobA, which have been proposed to determine its selectivity for GTP (33), are replaced in ORF11p by leucine and arginine, respectively. Remarkably, these residues are also found at corresponding positions in the MobA-like proteins of Arthrobacter nicotinovorans (ORF 204 of pAO1; accession no. AAK64261) and Pseudomonas putida 86 (ORF 4; accession no. CAE47360). Since the corresponding mobA-like genes also are clustered with genes encoding molybdoenzymes with the molybdopterin cytosine dinucleotide cofactor, it is tempting to speculate that the conserved L and R residues mediate specificity for CTP.
ORF12p is similar to MoaA-like proteins, which together with MoaC catalyze the first step in Moco biosynthesis, and ORF13p exhibits 36% identity to MoaE of E. coli (accession no. AAC73872), which is a subunit of molybdopterin synthase. However, homologs of moaC, moaD, moeB, mogA, moaA, and moaB, which are required for Moco biosynthesis in prokaryotes, and modABC genes encoding molybdate uptake proteins were not detected on pAL1. Growth of bacteria on hypoxanthine usually requires a Moco-dependent xanthine oxidoreductase. Replacement of molybdate in the mineral salts medium by tungstate completely suppressed growth and hypoxanthine utilization by strain Rü61a and its pAL1-deficient mutant (data not shown), confirming that these strains indeed contain a canonical xanthine oxidoreductase. In the presence of molybdate, the two strains grew equally well on hypoxanthine as a sole carbon source. Moreover, catalytically active xanthine oxidoreductase was detected in crude extracts of hypoxanthine-grown cells of both wild-type and plasmid-deficient A. nitroguajacolicus Rü61a (data not shown), indicating that the genome of the pAL1-deficient mutant contains all of the genes for Moco biosynthesis.
Genes presumed to be involved in metabolism of aromatic compounds. Based on the isolation of catechol as an intermediate in the quinaldine degradation pathway and the detection of catechol 1,2-dioxygenase activity (27), degradation of anthranilate by A. nitroguajacolicus Rü61a was proposed to proceed via the classical β-ketoadipate pathway (Fig. 1B). However, genes that might code for a catechol-forming anthranilate 1,2-dioxygenase and a catechol dioxygenase were not identified in pAL1. Since the pAL1-deficient mutant of A. nitroguajacolicus is able to utilize anthranilate and catechol as carbon sources (data not shown), such an ortho cleavage pathway may well be encoded on the chromosome. However, sequence analysis of ORFs 19 to 23 of pAL1 suggested that a second route for metabolism of anthranilate may be present in strain Rü61a.
The amino acid sequence deduced from the ORF 22 sequence exhibits 52% identity with the sequence of the 2-aminobenzoate-coenzyme A (2-aminobenzoate-CoA) ligase from Azoarcus evansii (accession no. AAL02069). ORF 23 codes for a protein that exhibits 50% identity to the natural fusion protein 2-aminobenzoyl-CoA monooxygenase/reductase (accession no. AAL02063), which in A. evansii catalyzes the formation of 2-amino-5-oxo-cyclohex-1-ene-carbonyl-CoA from 2-aminobenzoyl-CoA (54). Two putative thioesterases are encoded by ORFs 21 and 25; ORF 21 codes for a hypothetical acyl-CoA thioesterase II (pfam02551), while ORF25p comprises the cd00586 domain of 4-hydroxybenzoyl-CoA thioesterases. The products of ORF 22, ORF 23, and ORF 21 and/or ORF 25 may well be involved in anthranilate catabolism via 2-aminobenzoyl-CoA and 2-amino-5-oxo-cyclohex-1-ene-carbonyl-CoA (Fig. 1B). In A. evansii, the latter compound was proposed to be degraded in a β-oxidation-like pathway (54). If such a β-oxidation pathway were functional in strain Rü61a, it would require the involvement of enzymes encoded on the chromosome (or additional DNA elements), as genes that could code for enoyl-CoA hydratase/isomerase or acyl-CoA dehydrogenases were not detected on pAL1.
Carbon source-dependent transcription and operon structure of genes presumably involved in the degradation of quinaldine and aromatic compounds. RT-PCR analysis of RNA isolated from A. nitroguajacolicus Rü61a grown on different carbon sources revealed that the qoxM and amq genes and ORF 23 (Fig. 3B), as well as ORFs 3, 4, 6, 7, 8, 10, and 19 to 22 (not shown), were distinctly expressed when cells were grown on quinaldine or on aromatic intermediates of the pathway for conversion of quinaldine to anthranilate. Even if RT-PCR provided only a semiquantitative estimate of transcript formation, expression of qoxM and amq clearly was weaker in succinate-grown cells than in cells grown on quinaldine (Fig. 3B). Similar results were obtained for ORFs 3 and 10, for the qoxL, qoxS, moq, and hod genes (not shown), and for the intergenic regions of ORFs 3 to 6 and 7 to 11 (Fig. 3C; data not shown). However, the differences in transcript formation in quinaldine- and succinate-grown cells seemed to be less pronounced for the region comprising ORFs 19 to 23 (Fig. 3B and C). Transcription of ORF 16 (Fig. 3B) and ORF 24 (not shown), each of which codes for a putative PaaX-type transcriptional repressor, occurred independent of the carbon source.
The specific activities of Qox and Hod in extracts from cells grown on different carbon sources were determined. The level of Qox activity in the crude extract (soluble fraction) from succinate-grown cells was below the detection limit of the spectrophotometric standard assay, but minor catalytic activity was detected on a nondenaturing polyacrylamide gel stained to reveal Qox activity (data not shown). In contrast, enzymatic activity was readily detected in the soluble fractions of cell extracts from quinaldine- and 1H-4-oxoquinaldine-grown cells (Qox specific activity, 0.05 U/mg protein). The activities of the ring cleavage dioxygenase Hod in the soluble fractions of extracts from quinaldine- and 1H-4-oxoquinaldine-grown cells were about 1 to 1.2 U/mg protein, compared to <0.01 U/mg in an extract from succinate-grown cells. Strain Rü61a exhibited diauxic growth in mineral media with succinate plus either quinaldine or 1H-4-oxoquinaldine, and only the second logarithmic growth phase correlated with consumption of the aromatic substrate (data not shown), suggesting that there was succinate-mediated repression of the degradation of heteroaromatic compounds.
Transcriptional start sites of catabolic operons and putative promoters.
To determine the potential transcriptional start sites of the catabolic operons qoxLMS-ORF 3, moq-hod-amq-ORF 10-ORF 11, and ORF 19 to ORF 23, a 5'-RACE analysis was performed with RNA isolated from quinaldine-grown cells. The deduced transcriptional initiation site of the qoxLMS-ORF 3 operon was located 110 nucleotides upstream of the ATG start codon of qoxL; it was preceded by a putative −10 box (CATACT), as well as a putative −35 region (TTGACG) for the binding of σ70-dependent RNA polymerase (Fig. 4A). The transcriptional start site of the second operon was located 126 nucleotides upstream of the moq start codon (Fig. 4B). The TATATA and TTGACG sequences might represent a −10 region and a −35 region, respectively, of the Pmoq promoter. The putative transcriptional start site of the operon consisting of ORFs 19 to 23 was located 48 nucleotides upstream of the start codon of ORF 19; however, analysis of its upstream sequence did not reveal putative −10 and −35 elements of a σ70-dependent promoter. Notably, the −10 region exhibited similarity to the −10 consensus sequence recognized by the σ38 RNA polymerase subunits (−15-TGCTACCTT-−7; nucleotides important for σ38 recognition are underlined) (Fig. 4C) (4, 35). As in most promoters that bind the σ38 factor (4), a conserved −35 region was not obvious in Porf19.
Genes presumed to be involved in conjugation. Recently, we described conjugative transfer of pAL1 to the pAL1-deficient mutant of A. nitroguajacolicus Rü61a and to A. nicotinovorans DSM 420 (40). Bacterial conjugation is a complex process that involves DNA-processing enzymes, proteins involved in mating pair formation, and a coupling protein that guides the DNA-protein complex to a type IV secretion system (8, 37). The deduced gene product of ORF 28 of pAL1 exhibits similarity to about 30% of the pfam02534 domain conserved in coupling proteins belonging to the TraG/TraD family and the VirD4 family (COG3505). The predicted ORF 29, which overlaps ORF 28 for 52 bp, encodes a small 100-amino-acid (aa) protein. The N-terminal 47 residues of this protein exhibit 65% identity to residues 111 to 157 of a putative TraG protein of Arthrobacter sp. strain FB24 (accession no. ABK05710) and may comprise an ATP binding motif. Thus, ORFs 28 and 29 may have originally been derived from a single gene that was disrupted by a frameshift mutation.
ORF69p is related to nucleoside triphosphate binding proteins, and its sequence aligns with 52% of the sequence of COG3451 representing the conserved domain of VirB4, an ATPase crucial for assembly of the transenvelope channel and for induction of conformational changes of the translocation system required to drive transport of the DNA-protein substrate (10). ORF 84 may code for an ATP binding VirB11 component of a type IV secretion complex. The proteins encoded by ORFs 82, 85 to 89, and 91 to 92 were all predicted to contain transmembrane helices and therefore could be involved in formation of a translocation channel spanning the cell envelope. Similarities to TadB, TadC, and TadG-like proteins were found for the sequences of ORF85p, ORF86p, and ORF87p, respectively; the N-terminal regions of both ORF87p and ORF88p resemble TadE-like proteins. Tad (tight adherence) proteins are constituents of a system involved in the secretion and assembly of fimbriae (fibrils) (29, 63). The products of ORFs 82 and 85 to 89 of pAL1 thus might contribute to formation, assembly, or anchorage of a secretion pilus. The deduced ORF93p sequence exhibits 41% similarity to the sequence of TcpJ, the integral membrane type IV prepilin peptidase of Vibrio cholerae (accession no. AAK20796), and could be involved in processing of secreted type IV prepilins. In conclusion, gene clusters and individual genes presumed to code for proteins involved in conjugation and DNA transfer are somewhat scattered on pAL1, comprising ORFs 28 and 29, ORF 69, ORFs 82 and 84 to 89, and ORFs 91 to 93.
Conserved gene clusters. Conserved clustering of genes suggests a common evolutionary ancestry and perhaps a functional connection of the gene products. Three regions homologous to gene clusters previously described for other bacteria are apparent. (i) The first region includes ORFs 67 to 73, with best hits in BLAST searches with proteins encoded by linear plasmids pBD2, pREL1, pRHL2, and SCP1 of Rhodococcus erythropolis BD2, R. erythropolis PR4, Rhodococcus sp. strain RHA1, and Streptomyces coelicolor A3(2), respectively (Table 2). Additionally, three of these genes are also conserved in the chromosome of Streptomyces avermitilis MA-4680. (ii) The second region includes ORFs 82 to 89 and 91 to 92 (Table 2), presumed to code for proteins of a secretion system possibly involved in conjugation, interrupted by ORF 90 coding for a hypothetical protein without any database matches. Proteins encoded by corresponding genes on SCP1 of S. coelicolor A3(2) have been proposed to form a surface-located protein complex (5). (iii) The third region includes ORFs 74 to 79 (see Table S3 in the supplemental material); for most of the deduced gene products, functional classification was not possible. This region occurs in genomes of physiologically and phylogenetically diverse organisms.
The consecutive ORFs 49 and 50 encode homologs of SOS mutagenesis and repair proteins UmuD and UmuC, respectively. Residues 59 to 127 of ORF49p resemble a conserved domain of family 24 peptidases (pfam00717), which includes UmuD and its plasmid-encoded homolog MucA; aa 16 to 360 of ORF50p align with the pfam00817 domain conserved in the UmuC and MucB proteins representing DNA polymerase V (16, 47). Translesion replication in addition to UmuC and UmuD' requires RecA and a single-stranded DNA binding protein (SSB). The product of ORF 39 exhibits 29% identity to the N-terminal fragment (aa 1 to 135) of SSB from E. coli (SSBC; accession no. 1EYG) carrying the single-stranded DNA binding site (45). Thus, it might be involved in SOS repair or even in regular replication and/or telomere patching of pAL1.
Genes presumed to be involved in "telomere patching." Proteins encoded by ORFs 101 to 103 might be involved in reactions that are specific for replication at the telomeres ("telomere patching"). The closest BLAST matches for the putative gene products of ORFs 101 to 103 are rhodococcal proteins (Table 1). The large protein ORF101p appears to consist of multiple domains. Its C-terminal region (aa 1,446 to 1,663) exhibits 29% identity to residues 415 to 678 of both TapL of Streptomyces lividans (accession no. AAO73842) and TapC of S. coelicolor (accession no. AAO73843), which, however, consist of only 739 aa. Since the telomere-associated protein TapL of S. lividans binds to specific single-stranded DNA sequences of telomeric 3' overhangs of Streptomyces plasmid pSLA2 and interacts with the Tp, it was suggested that it recruits Tp to the telomere termini of replication intermediates (3). Besides its "Tap domain" (aa 1,446 to 1,663), ORF101p has a zinc finger CHCC-type domain (smart00400) at the N terminus (aa 35 to 85), and its region comprising residues 598 to 868 resembles a domain of the superfamily II helicase (COG5519). Interestingly, the putative telomere-associated protein pRL2.4c encoded by linear plasmid pRL2 of Streptomyces strain 44414 (accession no. ABC67366), which consists of 1,100 aa, also has a superfamily II helicase domain in addition to a Tap domain. The Tap-like and superfamily II helicase domains of pRL2.4c and ORF101p of pAL1 exhibit 25 and 21% identity, respectively. An additional domain of ORF101p may be formed by the region covering aa 188 to 303, which matches the DNA primase core (i.e., its RNA polymerase domain) (SCOP accession no. SSF56731); this region seems to be absent from the two-domain pRL2.4c protein. Genes coding for large proteins (>1,700 aa) similar to the multidomain ORF101p protein of pAL1 have been described for rhodococcal linear replicons, including pBD2.007 of R. erythropolis pBD2 (60), pREL1_0008 coding for a putative telomere binding protein of R. erythropolis PR4 (56), and RHA1_ro10009 of pRHL2 of Rhodococcus sp. strain RHA1 (accession no. ABH00202). Like ORF101p, two of these proteins contain a DNA primase domain (RHA1_ro10009 [aa 213 to 288] and pBD2.007 [aa 200 to 277]). We suggest that ORF 101 of pAL1 and its rhodococcal orthologs code for proteins that perform functions comparable to those of Tap of Streptomyces spp.; however, the additional domains may broaden their roles in the telomere patching reaction.
The deduced protein ORF102p resembles pBD2.006, RHA1_ro10008, and pREL1_0007. Remarkably, it exhibits weak but significant similarity to Tps of Streptomyces linear replicons, including 24% identity to TpgCL1 from Streptomyces clavuligerus and 25% identity to the proposed terminal protein pRL2.3c of Streptomyces sp. strain 44414 (68), suggesting that it may represent a terminal protein of pAL1. A recent study of Tp of S. coelicolor demonstrated that the 5'-terminal nucleotide of a Streptomyces linear replicon is covalently linked to a threonine residue located in the C-terminal region of the Tp (66). Note that ORF102p, like the Streptomyces Tps, contains numerous threonine residues (i.e., 17 threonine residues); however, since the amino acid residues linked to the phosphodiester bond of DNA are different in the terminal proteins of different taxonomic groups (51), the hypothesis that a Thr residue participates in the deoxynucleotidylation reaction of Tp of pAL1 is highly speculative.
Both ORF102p and ORF103p, like Tps of Streptomyces linear replicons (67), have high theoretical pI values (pI 9.79 and 10.05), and both are predicted to contain DNA binding domains at the N terminus. ORF103p is similar to the hypothetical proteins pBD2.005, pREL1_0006, and RHA1_ro10007. It is presumed to be a DNA binding protein that could be involved in replication and/or telomere patching. Multiple alignment of ORF102p and ORF103p, their homologs from Rhodococcus strains, and Streptomyces Tps revealed a short common motif, (T/V)(X)3(A/S)(X)3(G/R)(V/I)(S/T)XRT(V/I)XR (where X is any amino acid), involving conserved serine and threonine residues (underlined), located in the N-terminal, DNA binding region of the proteins. Even if Tps of streptomycetes were not among the hits when a BLASTP search was performed, the possibility that ORF 103 might code for a second Tp should not be excluded.
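The common motif given above, (T/V)(X)3(A/S)(X)3(G/R)(V/I)(S/T)XRT(V/I)XR, translates directly into a regular expression that can be used to scan protein sequences. The sketch below is an illustration only and not the alignment-based procedure used in this study; the function name and the test sequence are made up for demonstration.

```python
import re

# (T/V)(X)3(A/S)(X)3(G/R)(V/I)(S/T)XRT(V/I)XR, where X is any amino acid.
# One-letter amino acid codes; X is written as [A-Z].
MOTIF = re.compile(
    r"[TV][A-Z]{3}[AS][A-Z]{3}[GR][VI][ST][A-Z]RT[VI][A-Z]R"
)


def find_motif(protein):
    """Return (0-based start, matched text) for non-overlapping motif hits."""
    protein = protein.upper()
    return [(m.start(), m.group(0)) for m in MOTIF.finditer(protein)]


if __name__ == "__main__":
    # Synthetic sequence constructed to contain one motif instance.
    demo = "MKTAAAAAAAGVSARTVARLL"
    print(find_motif(demo))
```

A pattern this short and degenerate can match by chance in unrelated proteins, so a hit is at most a pointer for closer inspection of an alignment.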
In conclusion, based on sequence analysis of ORFs 101 to 103 of pAL1 and the corresponding orthologs in rhodococcal linear plasmids, we suggest that these actinomycetal linear replicons encode telomere patching proteins whose primary sequence and domain structure differ significantly from the primary sequence and domain structure of proteins encoded by Streptomyces replicons. The difference in protein architecture may indicate that there are subtle differences in the telomere patching mechanisms.
Analysis of the left and right termini of pAL1. The termini of different actinomycetal linear replicons have been found to contain palindromic sequences with the potential to "fold back" to form secondary structures presumed to be functionally important for telomere patching. Assuming that pAL1 has blunt-end termini, fragments of pAL1 generated by restriction with PstI were inserted into the PstI- and EcoRV-digested vector pBluescript II SK(+), resulting in a 2,014-bp insert (pBSK3) and a 3,092-bp insert (pBSK5) corresponding to the left and right termini of pAL1, respectively. However, since previous work on actinomycetal linear replicons showed that 5' ends of protease-treated plasmid DNA may still be blocked by linkage of residual peptides to the DNA (23, 25), the blunt ends of the cloned inserts, instead of representing the end of pAL1, might have resulted from DNA shearing. Therefore, the PstI fragments were cloned again from DNA subjected to proteinase K digestion, as well as alkali treatment to hydrolyze the ester bond between the terminal nucleotide and any remnants of Tp. Four 3.0-kb inserts and five 2.0-kb inserts were identified and sequenced. Compared to the sequences of the inserts of pBSK3 and pBSK5, each of these new DNA fragments contained an additional nucleotide at the blunt-end terminus. Sequencing of these terminal fragments also revealed that the 5' nucleotide of both ends of pAL1 is dCMP, as observed for all other actinomycetal linear replicons studied so far (25, 56, 66, 68).
The first 100 nucleotides of the two terminal sequences of pAL1 exhibit a rather low level of sequence identity (53%), but they contain three similar palindromic sequences, palindromes I to III (Fig. 5). Palindromes II and V of the left pAL1 terminus both have the central motif 5'-GCTGCGC-3' (Fig. 5B), which in a single-stranded 3' overhang may form a stable hairpin structure with a single C residue loop closed by sheared purine-purine (G-A) pairing (Fig. 5A). The 5'-GCTNCGC-3' motif was found to be conserved in terminal sequences of several Streptomyces and Rhodococcus linear replicons, in the termini of pCLP of M. celatum, and in the 3' ends of the genomes of autonomous (helper-independent) parvoviruses (9, 25, 30, 43, 56, 60), suggesting that it has some general relevance in protein-primed replication mechanisms. Remarkably, the two specific binding sites of single-stranded DNA on telomeric 3' overhangs of Streptomyces plasmid pSLA2, which are recognized by the Streptomyces Tap protein, include this conserved GCTNCGC motif as a core sequence (3). However, the right end of pAL1 does not exhibit similarity to terminal sequences of Rhodococcus or Streptomyces replicons (56) and lacks the 5'-GCTNCGC-3' motif (Fig. 5).
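The kind of sequence comparison described above (locating the 5'-GCTNCGC-3' motif, checking whether two arms form an inverted repeat, and computing percent identity between two aligned termini) can be illustrated with a minimal sketch. It is not the procedure used in this study. The function names are hypothetical and the test sequence is an invented toy string, not a pAL1 terminus. The sketch also makes no attempt to model hairpin stability or sheared G-A pairing.

```python
import re

# Map motif letters to regex fragments; N stands for any nucleotide.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif="GCTNCGC"):
    """Return (0-based position, matched text) for every motif occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    hits = re.finditer(f"(?=({pattern}))", seq.upper())
    return [(m.start(), m.group(1)) for m in hits]


def is_inverted_repeat(left_arm, right_arm):
    """True if right_arm is the reverse complement of left_arm,
    i.e. the two arms could base pair as a hairpin stem."""
    return reverse_complement(left_arm) == right_arm.upper()


def percent_identity(a, b):
    """Percent identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    toy = "AACCGCTGCGCGGTT"  # invented example, NOT a pAL1 sequence
    print(find_motif(toy))
    print(is_inverted_repeat("AACC", "GGTT"))
    print(percent_identity("ACGT", "ACGA"))
```

In the toy string, the flanking arms AACC and GGTT are reverse complements of each other and enclose the central GCTGCGC motif, which mirrors the general layout of a palindrome around a conserved core.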
Concluding remarks. Since pAL1 confers the ability to degrade quinaldine to anthranilate and may code for enzymes involved in anthranilate conversion via 2-aminobenzoyl-CoA, it can be considered a catabolic plasmid. Despite the apparent lack of transposons and insertion sequences, it has a somewhat modular structure. A distinctive feature of pAL1, which apparently is shared by some rhodococcal plasmids, is the large putative telomere-associated protein that differs from streptomycetal Tap proteins, as it includes additional domains. It would be interesting to investigate the properties of this protein with respect to DNA binding, protein-protein interactions, and possible catalytic activities.
| ACKNOWLEDGMENTS |
|---|
We thank Stephan Kolkenbrock, Münster, Germany, for assistance with graphic representation.
| FOOTNOTES |
|---|
Published ahead of print on 2 March 2007.
Supplemental material for this article may be found at http://jb.asm.org/.
Present address: Centre for Microbial Diseases and Immunity Research, University of British Columbia, 2259 Lower Mall, Vancouver, BC, V6T 1Z4, Canada.
Present address: Qiagen GmbH, Qiagen Strasse 1, 40724 Hilden, Germany.
| REFERENCES |
|---|
σS dependent? Role of the -13/-14 nucleotide promoter positions and region 2.5 of σS. Mol. Microbiol. 39:1153-1165.
α/β-hydrolase-fold protein active towards aryl-acylamides and -esters, and properties of its cysteine-deficient variant. J. Bacteriol. 188:8430-8440.