Journal of Bacteriology, June 2006, p. 3923-3935, Vol. 188, No. 11
0021-9193/06/$08.00+0 doi:10.1128/JB.01953-05
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Genetics, Microbiology and Toxicology, Stockholm University, S-106 91 Stockholm, Sweden
Received 21 December 2005/ Accepted 14 March 2006
| INTRODUCTION |
|---|
γ-proteobacteria and share common traits such as morphology, control of lytic versus lysogenic growth, and noninducibility by UV light (for a recent review, see reference 26). Temperate phages have the ability to reproduce by two alternative life cycles: the lytic or the lysogenic cycle. In the latter life cycle, the phage genome integrates into a specific location on the host chromosome, and most phage genes are turned off by the phage-encoded immunity repressor C. P2-like phages are prevalent in Escherichia coli strains; about 30% of the strains in the ECOR collection (28) contain P2-like prophages (27). P2-like phages that are found in other γ-proteobacteria are more distantly related to P2 than those found in E. coli, and it seems as if the evolution of the P2-like phages tracks the evolution of their respective hosts (26; A. S. Nilsson, unpublished data). An analysis of the DNA sequence of the late structural genes of 18 P2-like isolates that grow on E. coli showed that these genes are at least 96% identical to the genes of P2 (25). Thus, these P2-like coliphages might be considered different isolates of P2, but they have been shown to have different immunities, based on their capacity to grow on bacteria lysogenized with different P2-like phages (6, 9, 14), and to integrate into at least two different sites in the host chromosome (22, 38). Phage 186 is a more distantly related E. coli phage, and its immunity repressor, cI, differs in size and sequence from the C repressors of the P2-like coliphages in this study. In fact, both cI and Apl (the equivalent of Cox) of phage 186 are more closely related to the cI and Cox repressors of Haemophilus influenzae phages HP1 and HP2 and Pseudomonas aeruginosa phage K139 (26) and are therefore excluded from this study. Since P2-like phages are frequently found in E. coli strains, they compete with each other when present in the same host cell, either after superinfection of a lysogen or after mixed infection. This most likely has driven evolution towards different immunities and integration sites.
The transcriptional switch of P2-like phages that controls the lytic versus the lysogenic growth cycle contains two face-to-face-located promoters and two repressors, C and Cox (Fig. 1). The immunity repressor C, encoded by the first gene transcribed from the Pc promoter, blocks transcription from the early promoter Pe, leading to the formation of lysogeny. The integration of the phage genome into the host chromosome is promoted by the phage integrase, which is encoded by the int gene located downstream of the C gene. The first gene of the Pe operon is cox, whose product is a repressor of Pc.
In this way, the two pathways are mutually exclusive (26). In the lysogenic stage, the immunity repressor C will also block the lytic growth of a superinfecting P2-like phage belonging to the same immunity group. Under these conditions, the superinfecting phage must integrate into the host chromosome to be stably maintained. If the two phages share the same host integration site, however, this would lead to prophages integrated in tandem, which is an unstable state for phage P2 due to the expression of the int gene when prophages are integrated in this way (7, 8, 9). Alternatively, the superinfecting phage genome can become integrated into secondary integration sites under this condition (2, 10). In the P2-like phages studied so far, there is a coupling between the developmental switch and the site-specific recombination that leads to the integration of the phage genome into the chromosome or its excision from it. The Cox protein is, in addition to being a repressor of Pc, a directionality factor for site-specific recombination. It inhibits integrative recombination but is required for excision (39).
| MATERIALS AND METHODS |
|---|
The P2-EC4 C gene was cloned into the pET-16b (Novagen Inc.) expression vector under the control of the T7 promoter, resulting in plasmid pEE2050. The C gene was amplified from genomic DNA of the ECOR4 isolate containing the P2-like prophage P2-EC4 by using primers EC4 C-C and EC4 C-N (Table 2). The 0.3-kb PCR product generated was cleaved with NcoI and BamHI and inserted between the NcoI and BamHI sites of the vector. Subsequent DNA sequencing was used to confirm the in-frame ligation.
ΦD145 integrase gene and attP region were sequenced, and the sequences of four P2-like phages were downloaded from the GenBank database, i.e., P2, L-413C, P2 Hy dis, and WΦ. Nucleotide sequencing was performed by Macrogen Inc. (Seoul, Korea) or MWG-Biotech (Ebersberg, Germany).
Sequence alignment and analysis. Similarity searches were performed at the website of the National Center for Biotechnology Information, where the sequences were compared to each other and to other sequences in the GenBank database with the programs ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/), BLAST (http://www.ncbi.nlm.nih.gov/BLAST/), and bl2seq (http://www.ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi).
The amino acid sequences for the C and the Cox proteins were aligned with the ClustalX multiple sequence alignment program (version 1.81; IGBMC, University of Strasbourg, France [ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX/]). Sequence editing was performed with the Jalview alignment editor (version 2.02; School of Life Sciences, University of Dundee, Scotland, United Kingdom [http://www.jalview.org/]), and final refinement was done manually. When nucleotide sequences were used in the phylogenetic analyses, they were aligned and edited as described above and then adjusted by hand to match amino acid alignments.
Phylogenetic analyses. The PAUP* program (version 4.0b10; Sinauer Associates, Inc., Sunderland, Mass. [http://paup.csit.fsu.edu/about.html]) was used for the phylogenetic analyses under maximum-parsimony criteria and with the heuristic search option due to the large numbers of taxa and some homoplastic characters. The program was used with all the default settings except that the stepwise addition of taxa was randomized 10 times per run, keeping the two shortest trees of each run. To assess the degree of confidence for the resulting shortest trees, they were tested in bootstrap analyses, each with 1,000 replicates.
The history of homologous recombination between taxa was evaluated with three different methods: (i) visualization of the relationship between taxa with the SplitsTree program (version 4; Algorithms in Bioinformatics, University of Tübingen, Germany [http://www-ab.informatik.uni-tuebingen.de/software/jsplits/welcome.html]), (ii) inspection of the distribution of the parsimoniously informative (PI) characters (characters present in at least two but not all taxa) that are homoplastic in the shortest trees, and (iii) statistical testing of the distribution of character differences between taxa with the program Geneconv (version 1.81; Department of Mathematics, Washington University in St. Louis, St. Louis, Mo. [http://www.math.wustl.edu/~sawyer/geneconv/]).
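As an illustration of method (ii), the sketch below flags parsimony-informative columns in a small alignment. It is not the procedure used in the study: it applies the common textbook criterion (at least two character states, each present in at least two taxa), and the function name and toy alignment are hypothetical.

```python
from collections import Counter


def parsimony_informative_sites(alignment):
    """Return 0-based indices of parsimony-informative columns.

    Assumed criterion: a column is informative when at least two
    different states each occur in at least two sequences.
    Gap and ambiguity symbols ('-', 'N', '?') are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    informative = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] not in "-N?")
        shared_states = [state for state, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            informative.append(col)
    return informative


# Hypothetical four-taxon alignment, for illustration only.
toy_alignment = [
    "ACGTA",
    "ACGTC",
    "ATGAC",
    "ATGAA",
]
print(parsimony_informative_sites(toy_alignment))  # [1, 3, 4]
```

Columns 1 and 3 support one grouping of the four toy taxa while column 4 supports a conflicting one; conflicts of this kind are what show up as homoplasy on the shortest tree.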
The SplitsTree program tree-building algorithm does not presume a bifurcating tree. Instead, it detects conflicting phylogenetic signals and tries to construct a figure that reveals these more complex relationships (1). If the program detects that recombination has occurred, affected taxa will be connected by more than one node, and the resulting tree will include one or more networks.
To visualize possible homologous recombination, the distribution of homoplasies in an aligned set of PI sites for all of the genes was assessed according to a method described previously by Nilsson and Haggård-Ljungquist (25). Since excess homoplasy is regarded as a hallmark of recombination (24), homoplasy can be used to detect regions that have undergone recombination. An uneven distribution of homoplastic PI sites signals recombination but could also be caused by rapid recent directional selection. The influence of selection was assessed with the program K-estimator (version 6.1; Genome Informatics Laboratory, Department of Biological Sciences, Indiana University, Ind. [http://www.biology.uiowa.edu/comeron/index_files/Page322.htm]) as described below.
The Geneconv program extends the methods of statistical tests for detecting gene conversions previously described by Sawyer (36). Basically, the program tests whether a pair of sequences in the alignment contains significantly longer identical or almost identical fragments than other pairs in the alignment, which results in a global probability for the actual pair. The probability is corrected for both total alignment length and the number of possible pairs, which grows quadratically with the number of taxa in the alignment. Thus, Bonferroni-corrected Karlin-Altschul P values of <0.05 were considered significant for global fragments. The program also tests whether a fragment within a pair is longer than expected in the absence of recombination. These pairwise P values are also corrected for total sequence length and were considered significant if they were below 0.01. The aligned nucleotide sequences of concatenated coding regions (start of int-C-cox-start of orf78) of all possible pairs (21 pairs in all) of the seven immunity classes were analyzed by applying the default settings, except that the mismatch penalty was set to 1 (gscale = 1). This setting allows for the possibility of mutations taking place within the recombined regions after a recombination has occurred.
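The pair count and the multiple-comparison logic can be sketched as follows. This is a generic Bonferroni correction, not Geneconv's internal calculation; the helper names and the raw P value are hypothetical.

```python
from itertools import combinations


def all_pairs(taxa):
    """Every unordered pair of taxa: n * (n - 1) / 2 pairs for n taxa."""
    return list(combinations(taxa, 2))


def bonferroni(p_value, n_tests):
    """Generic Bonferroni correction: multiply by the number of tests, cap at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


immunity_classes = ["I", "II", "III", "IV", "V", "VI", "VII"]
pairs = all_pairs(immunity_classes)
print(len(pairs))  # 21

raw_p = 0.001  # hypothetical uncorrected P value for one pair
corrected = bonferroni(raw_p, len(pairs))
print(corrected < 0.05)  # True
```

With seven classes there are 7 × 6 / 2 = 21 pairs, which matches the number of pairwise comparisons given in the text.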
The possibility that observed homoplasies were caused by selection was evaluated by comparing the number of synonymous nucleotide substitutions per synonymous site (Ks) with the number of nonsynonymous nucleotide substitutions per nonsynonymous site (Ka), which was carried out by the program K-estimator. A Ka/Ks ratio below 1 is a sign of purifying or stabilizing selection, whereas a Ka/Ks ratio of >1 indicates directional selection.
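The interpretation rule above can be written as a small helper. This is only a sketch of the decision rule; it does not estimate Ka or Ks (that was done with K-estimator), and the function name and input values are hypothetical.

```python
def interpret_ka_ks(ka, ks):
    """Classify a Ka/Ks ratio using the thresholds described in the text.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1:
        verdict = "purifying or stabilizing selection"
    elif ratio > 1:
        verdict = "directional selection"
    else:
        verdict = "no evidence of selection (ratio equal to 1)"
    return ratio, verdict


# Hypothetical substitution rates, for illustration only.
ratio, verdict = interpret_ka_ks(ka=0.034, ks=0.10)
print(round(ratio, 2), verdict)
```

A ratio is undefined when no synonymous substitutions are observed, which is why the helper rejects a Ks of zero instead of returning infinity.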
Nucleotide sequence accession numbers.
The nucleotide sequences between int and orf78 of the 34 prophages have been submitted to the EMBL databases under accession numbers AM159049 to AM159082, and the complete sequence of ΦD145 int can be found under accession number AM158280 (Table 1).
| RESULTS |
|---|
ΦD160 and P2-EC12, ΦD145, P2-EC46 and P2-EC15 and the position of P2-EC4. The two gene trees could be largely reconciled if all of these were removed. The support for both gene phylogenies was generally good; most nodes had a 100% bootstrap support, so it seems likely that the differences in tree topology reflect a real difference in the evolution of C and Cox for these phages.
ΦD266, HK240, HK241, 299, P2-EC7, P2-EC30, P2-EC31, P2-EC45, P2-EC48, and P2-EC58), class II constitutes the isolates with phage P2 Hy dis immunity (P2 Hy dis, ΦD124, ΦD252, HK111, HK114, P2-EC5, P2-EC10, and P2-EC67), class III has the same immunity as WΦ (WΦ, P2-EC44, and phage 18), class IV has the HK109 immunity (HK109, HK113, HK239, L-413C, P2-EC53, P2-EC59, P2-EC61, P2-EC62, and P2-EC64), class V has the ΦD160 immunity (ΦD160 and P2-EC12), class VI has the immunity of phage ΦD145 (ΦD145, P2-EC15, and P2-EC46), and class VII consists of a single prophage, P2-EC4.
ΦD145), for instance, contains three members that show a high identity to class V (78%), but they have previously been classified as belonging to different immunity groups, since ΦD145 was able to plate on a ΦD160 lysogen and vice versa (9). However, we found that the plating efficiency of ΦD145 on a ΦD160 lysogen was reduced 10-fold compared with that on a nonlysogen, indicating that phage ΦD145 has some sensitivity to the ΦD160 repressor. Phage ΦD145 also grows well on the class II HK114 lysogen but not at all on an HK111 lysogen, which belongs to the same immunity class as HK114. The reciprocal spot tests make the situation even more confusing, since HK111 produces plaques on bacteria lysogenic for ΦD145, with reduced plating efficiency, while HK114 does not grow at all on a ΦD145 lysogen. HK111 and HK114 belong to the same immunity class and cannot grow on each other's lysogens. The C protein of class VI consists of 103 or 104 amino acid residues, thus making it the longest C protein in this study. The C proteins are identical except for a short region starting at codon 23, where ΦD145 has a sequence of six amino acids, while P2-EC15 and P2-EC46 have a different sequence of five amino acids.
Another immunity class, the previously unreported class VII, consisted of only one member, P2-EC4. The slightly weaker bootstrap support for this position (87%) motivated a closer investigation of its integrity. The C protein of P2-EC4 has a high identity to class III and class IV, 69% and 65%, respectively. The three-dimensional structures of the C proteins are not known, but secondary structure predictions using JPRED (version 2; School of Life Sciences, University of Dundee, Scotland, United Kingdom [http://www.compbio.dundee.ac.uk/~www-jpred/]) indicate that they contain at least three α-helices, where helix 2 and helix 3 most likely constitute a DNA binding helix-turn-helix (HTH) motif since they are separated by a conserved glycine residue. In this case, α-helix 3 should be the DNA recognition helix, which is supported by the fact that only the last two amino acids in this helix are conserved. Since the C proteins of P2-EC4 and WΦ differ in only one out of the eight amino acids in α-helix 3 (Fig. 3), the P2-EC4 C gene was cloned (pEE2050), and the capacity of WΦ to plate on bacteria containing plasmid pEE2050 or plasmid pEE900 (containing the WΦ C repressor) or on the nonlysogenic strain C-1a was determined. WΦ plated with the same efficiency on C-1a and on C-1a containing pEE2050 but formed no plaques on C-1a containing pEE900. The same pattern was shown when the plating efficiency of HK109 was assayed. HK109 formed plaques at the same frequency on the nonlysogenic strain C-1a and the construct with the pEE2050 plasmid but was unable to form any plaques on a C-1a strain lysogenic for HK109 (strain TD204). This implies that the C repressor of P2-EC4 is unable to block transcription from the Pe promoter of either WΦ or HK109.
Localization of presumptive early promoters and C operators.
In phages P2, P2 Hy dis, and WΦ, the strong early Pe promoters and the C operators have been located (22, 30) (Fig. 4). In these phages, the initiation codon of the C protein and the -35 region of Pe are either overlapping, as in the case for P2 and P2 Hy dis, or located back to back. The C operators have been shown to consist of two directly repeated sequences located on either side of the -10 region (P2 and WΦ) or of the -35 region (P2 Hy dis). Since the early promoters are expected to be strong, we have searched the equivalent regions of class IV to class VII, and presumptive promoters can also be found at similar positions in these classes (Fig. 4). A search for direct repeats spanning the -10 or -35 region of Pe revealed directly repeated sequences of 10, 8, 9, and 5 nucleotides (nt) spanning the -10 regions in class IV, V, VI, and VII, respectively. As expected, the sequences of these presumptive operators differ between the classes.
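A search of this kind can be sketched as below. This is not the search procedure used in the study: it looks only for exact direct repeats of a fixed length with one copy on each side of a given region, and the function names, coordinates, and toy sequence (built around the E. coli -10 consensus TATAAT) are hypothetical.

```python
from collections import defaultdict


def direct_repeats(seq, k):
    """Map each k-mer with at least two non-overlapping copies to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    repeats = {}
    for kmer, starts in positions.items():
        kept = []
        for start in starts:
            # Skip copies that overlap the previously kept copy.
            if not kept or start >= kept[-1] + k:
                kept.append(start)
        if len(kept) >= 2:
            repeats[kmer] = kept
    return repeats


def repeats_flanking(seq, k, region_start, region_end):
    """Keep repeats with a copy ending at or before region_start and a copy
    starting at or after region_end (0-based, end-exclusive coordinates)."""
    hits = {}
    for kmer, starts in direct_repeats(seq, k).items():
        left = [s for s in starts if s + k <= region_start]
        right = [s for s in starts if s >= region_end]
        if left and right:
            hits[kmer] = (left, right)
    return hits


# Hypothetical sequence: an 8-nt repeat on either side of a TATAAT hexamer.
toy = "AA" + "CATTGACC" + "TATAAT" + "CATTGACC" + "GG"
print(repeats_flanking(toy, k=8, region_start=10, region_end=16))
# {'CATTGACC': ([2], [16])}
```

Real operator half-sites need not be perfect copies, so a practical search would probably also allow mismatches; exact matching is used here only to keep the sketch short.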
The investigation of the distribution of informative characters between pairs of immunity classes showed several cases of a long run of homologous characters interrupted by sections consisting almost exclusively of homoplastic characters (data not shown).
The statistical analyses for detecting and estimating the extent of recombination, using the Geneconv program, showed that none of the immunity classes studied had been unaffected by recombination. Five of the 21 pairwise tests resulted in statistically significant apparent recombinational events. All of the classes seemed to have experienced at least one instance of recombination, and the detected breakpoints for recombination were all within coding regions (Fig. 6). The program did not give any statistical support for recombination events with genes other than those contained in the data set, i.e., recombination described as resulting in "outer fragments" in Geneconv.
To detect whether any selective forces were acting on the C and Cox proteins, the ratio of nonsynonymous substitutions per nonsynonymous site (Ka) to synonymous substitutions per synonymous site (Ks) of the nucleotide sequences was calculated. The Ka/Ks ratio for C was calculated to be 0.34 (standard deviation, 0.13), and the Ka/Ks ratio for cox was calculated to be 0.45 (standard deviation, 0.30). The predominance of synonymous substitutions could be interpreted either as purifying selection leaving only neutral changes or as directional selection that took place such a long time ago that synonymous substitutions have had time to accumulate and the signal is no longer distinguishable.
The regulatory intergenic region.
The length of the DNA sequences between C and cox genes in the different immunity classes varies from 92 nucleotides (ΦD160) to 162 nucleotides (HK109), and the sequences differ to such a high degree that alignments are impossible. Comparison of the intergenic regions within immunity classes is possible, though, and generally shows a high identity, with the lowest score being 98%.
Since the Pc promoters in P2, P2 Hy dis, and WΦ are weak, with a low identity to the consensus E. coli promoter (22, 30, 32), the Pc promoters of phages from other immunity groups may also be weak. A search of the intergenic region for possible Pc promoters gave no strong indications of possible promoters. Thus, their location remains to be determined.
The Cox protein of the P2-like phages analyzed couples the control of the transcriptional switch with site-specific recombination, since it acts as a repressor of Pc and as a directionality factor during site-specific recombination. Thus, Cox should bind in the vicinity of Pc and within attP. In the case of P2 and P2 Hy dis, which integrate into the same attachment site, the Cox proteins are interchangeable, even though α-helix 2, believed to be involved in DNA recognition, differs in three out of eight amino acids (22) (Fig. 5). WΦ has the same integration site as HK109, and a direct repeat (CCTAGAA[A/G]GGAC), located upstream of the Pc promoter, has been implicated as the recognition sequence of Cox. This sequence cannot be found in the intergenic region of HK109; only a subset of it, AGAA, can be found. Instead, a repeat of 6 nt (TTTGAG) is located in this region. Thus, it is possible that the Cox proteins of WΦ and HK109 are noninterchangeable.
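A degenerate motif such as CCTAGAA[A/G]GGAC maps directly onto a regular expression. The sketch below scans the forward strand only; the function name and the toy sequences are hypothetical and are not taken from the phage genomes.

```python
import re

# CCTAGAA[A/G]GGAC written as a regular expression (A or G at position 8).
COX_REPEAT = re.compile(r"CCTAGAA[AG]GGAC")


def find_motif(seq, pattern=COX_REPEAT):
    """Return (0-based start, matched text) for each forward-strand match."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical sequences, for illustration only.
with_repeat = "TTCCTAGAAAGGACGGGCCTAGAAGGGACTT"
without_repeat = "TTTGAGCCAGAATTTGAG"

print(find_motif(with_repeat))
# [(2, 'CCTAGAAAGGAC'), (17, 'CCTAGAAGGGAC')]
print(find_motif(without_repeat))
# []
```

The second toy sequence contains the AGAA subset but not the full 12-nt motif, so it returns no match, which mirrors the kind of negative result described for the HK109 intergenic region.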
ΦD160 and ΦD145 integrate at the same attachment site (see below), and, as can be seen in Fig. 5, they have identical amino acid sequences in α-helix 2. Thus, their Cox proteins are expected to recognize the same DNA sequence. A directly repeated sequence of 10 nt can also be found in the intergenic regions of both phages (data not shown) and in the attP region of ΦD145 (Fig. 7A) with the consensus sequence G4G5C6T6C5T4A4G5T4T6, taking all six repeats into account.
and P2 have previously been identified in E. coli (2, 22). In the sequenced K-12 strain MG1655, P2 integrates at 2,165.2 kb (46.6 min), and WΦ integrates at 4,104.4 kb (88.6 min). In the case of P2, the preferred integration site, locI, has been identified in E. coli strain C since this site is occupied by a defective prophage in K-12, leading to integration at several secondary sites. Using primers located on either side of locI, and primers within the phage genomes, we could show by PCR that all phages belonging to immunity class I and class II were integrated at locI. None of the others were integrated at this location, named site 1 in Fig. 3. PCR amplification with primers on either side of the integration site of WΦ, and with primers within the phage genomes, showed that all phages belonging to class III and class IV integrate at this site, denoted site 2 in Fig. 3. This left six phages and prophages with unknown integration sites, namely, those of class V, class VI, and class VII.
In order to identify these chromosomal integration sites, the DNA sequence of the attP region of phage ΦD145 was determined after PCR amplification using primers designed from well-conserved regions of the completely sequenced genomes of P2, WΦ, and L-413C. The primers were placed in the ogr gene, located to the left of attP, and in orf78, located downstream of cox, i.e., on the right side of attP (Fig. 1). The sequence between the ends of ogr and int is shown in Fig. 7A. The P2 integrase is heterobivalent. It has two DNA binding motifs that recognize different DNA sequences: the arm-binding motif, which recognizes the arm-binding sites present as two direct repeats on either side of the core, and the core-binding motif, which recognizes an imperfect inverted repeat in the core. The arm-binding motif is located at the N terminus of the integrase (16), and the WΦ integrase has a similar arm-binding motif, since the two integrases recognize similar arm sequences (22). Similar repeats were also found in the presumed attP region of ΦD145, which indicated that they constitute the arm-binding sites (P1 and P2, P'1 and P'2). A comparison between the 12 direct repeats in the arm sequences of P2, WΦ, and ΦD145 showed that they were highly similar and that the consensus sequence was T12G10T12G12G12A10C12A8.
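Consensus strings of this form can be produced from a set of aligned repeats as sketched below, reading each number as the count of repeats that carry the consensus base at that position (an interpretation, since the notation is not defined in the text). The function name and the four toy repeats are hypothetical and are not the actual arm sequences.

```python
from collections import Counter


def consensus_with_counts(repeats):
    """Build a string such as 'T4G3...': the most common base at each
    position followed by the number of repeats that carry it."""
    if not repeats:
        raise ValueError("at least one repeat is required")
    length = len(repeats[0])
    if any(len(r) != length for r in repeats):
        raise ValueError("all repeats must have the same length")

    parts = []
    for column in zip(*repeats):
        base, count = Counter(column).most_common(1)[0]
        parts.append(f"{base}{count}")
    return "".join(parts)


# Hypothetical 8-nt repeats, for illustration only.
toy_repeats = ["TGTGGACA", "TGTGGACA", "TATGGACT", "TGTGGTCA"]
print(consensus_with_counts(toy_repeats))  # T4G3T4G4G4A3C4A3
```

Positions where the count equals the number of repeats are fully conserved; lower counts mark the positions at which individual repeats deviate from the consensus.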
Furthermore, an IHF binding site is located to the left of the core in the P2 attP site, and since a potential IHF binding site can be found to the left of an imperfect repeat in ΦD145, this repeat may constitute the core sequence (Fig. 7A). A GenBank search for a bacterial sequence with similarity to this hypothetical ΦD145 core gave no hits, but a search for similar integrases potentially integrating within the same core sequence resulted in a candidate integrase. The integrase of the Erwinia carotovora subsp. atroseptica prophage ΦECA29 (3) was found to be 73% identical to the integrase of ΦD145, and there was a high degree of similarity between the putative ΦD145 and ΦECA29 core regions.
PCR amplifications of three different E. coli ΦD145 lysogen genomes with primers designed from the E. carotovora DNA flanking the ΦECA29 prophage and primers from within phage ΦD145 DNA all generated identical PCR fragments. Sequence analyses of the prophage junctions confirmed that ΦD145 integrates into E. coli at the location equivalent to that of ΦECA29 in Erwinia, namely, into the very beginning of the pflA gene. In fact, the attB sequence contains the start codon of pflA, and the integration of ΦD145 disrupts the gene. The pflA gene codes for an enzyme, pyruvate formate-lyase-activating enzyme, that is involved in the fermentation of pyruvate under microaerobic conditions (40).
The attB sequence, corresponding to ΦD145 attP of the nonlysogenic strain E. coli C-1a, was determined after PCR amplification using primers on either side of the presumed attB sequence. Analysis of the sequence showed that this region is identical to E. coli K-12 strain MG1655 (11). As shown in Fig. 7B, the ΦD145 core sequence and the attB sequence share an identical region of 13 nt, but the sequence similarity can be extended if mismatches are tolerated. The identical region should correspond to the region where branch migration occurs during recombination, while the imperfect inverted repeats should constitute the integrase recognition sequence. As can be seen in Fig. 7B, the imperfect repeat makes it clear that the recombination has occurred within the 13-nt region of identity.
The center of the attB region lies between positions 950293 and 950294 in the MG1655 map, but the region should contain at least 20 nt centered around this position if the inverted repeats are recognized by the integrase.
ΦD145 integrates at the 5' end of the pflA gene, and the crossing over occurs between positions +3 and +15 from the start codon of the gene. This creates a truncated gene at attR that codes for a peptide of eight amino acids. After that, there are two inverted repeats; one of them is the ogr/int terminator, which prevents transcription from continuing from the bacterial chromosome into the prophage or vice versa. In the attL region, the start codon of pflA is substituted by an AAG codon that codes for lysine. In the same frame, but 60 nt upstream of the lysine codon, there is a start codon. This ATG codon is preceded by a potential ribosomal binding site. In the equivalent region of ΦECA29 and Erwinia, there is also an in-frame start codon at the same distance, but the N-terminal additions to PflA show low similarity between ΦD145 and ΦECA29. This implies that the pflA gene is on the same transcript as the int gene and that the 20 additional amino acids at the N terminus seem not to affect its biological activity.
By using the same set of primers as those used for analyzing the integration site of ΦD145, ΦD160, P2-EC12, P2-EC15, and P2-EC46 were found to use the same integration site as ΦD145. However, this site is empty in ECOR4, and the integration site of P2-EC4 thus remains unknown.
| DISCUSSION |
|---|
γ-proteobacteria seem to have the same phylogenetic relationship. This phylogeny resembles the tree of their different γ-proteobacterial hosts, which indicates that P2-like phages coevolve with their hosts (Nilsson, unpublished). What then is the biological significance of having different immunities and integration sites? Since P2-like phages are commonly encountered as prophages in E. coli, superinfections of a lysogen, or possibly mixed infections, can be expected to be frequent events in nature. The outcome of such infections will depend on whether the phages have the same or different immunities and whether they share the same or have different host attachment sites. If a P2 lysogen is superinfected with another P2-like phage with the same immunity and attachment site, it will be repressed and must integrate to be stably maintained in the cell. However, P2 tandem double prophages are unstable due to int expression (7, 8), so the superinfecting phage will either integrate at secondary sites, which could affect its capacity to excise upon induction, or replace the resident prophage. Consequently, gaining a different immunity should be an advantage under these circumstances, since the superinfecting phage would be able to enter the lytic cycle. This is also in accordance with our observation that the immunities have diverged within the different integration site classes. When the regular integration site is occupied, there ought to be strong selection for a new immunity. But this is in conflict with the fact that there must be plenty of phages with different immunities around, even though they integrate at other sites, and there may be no need for developing a new immunity.
We have observed that immunity seems to be complete between members of the same immunity class, but we have also observed that it is sometimes incomplete or nonreciprocal between phages from different groups, like the immunity of the class VI phage ΦD145. There are several possible explanations for this. In uniform and infinitely large populations of phages and hosts, immunity would be expected to evolve through frequency-dependent selection and to result in a multitude of phages with different immunities. However, phages form classes containing many phages with the same immunity, and the number of immunity classes is limited. Consequently, an unknown spatial clonal variation of bacterial hosts, each harboring phages from all immunity classes and integrating at all possible locations, might exist. According to this hypothesis, phage ΦD145 should be found in host populations that do not contain ΦD160 or HK111, which are able to block transcription from their Pe promoters. It is also difficult to explain the coexistence of ΦD145 and HK114. Phage ΦD145 should be able to outnumber and possibly eradicate HK114 since it can use that lysogen for its lytic growth, and HK114 cannot grow on a ΦD145 lysogen. One explanation would be that they never meet. Another hypothesis is that the fitness of the phage is not dependent on immunity only. The number of possible hosts, lysogenic or not, that a phage with a specific type of immunity can use may be limited, but this could be balanced by lysogenic conversion genes that increase the fitness of the host and, indirectly, the phage as well. A third hypothesis is that immunity is a characteristic that is under such evolutionary constraint that the rate of change is slow. Expression of immunity involves both the C protein and the operators to which it binds, and there are many possible outcomes of a change in either of these. The ΦD145 and ΦD160 C proteins show 78% identity at the amino acid level (Table 3). They differ by only two amino acids in the third α-helix, which is believed to be the part that recognizes the operators. It is not surprising that this small difference affects immunity. Some phages may express an immunity that is "under construction" but sufficient under the actual circumstances.
The disagreement of the phylogenetic analyses of C and Cox together with the results of the analyses for detecting recombination performed in this work support homologous recombination as the causative agent for the generation of new immunity classes. P2 Hy dis is such an example. P2 Hy dis was obtained after P2 infection of E. coli B, containing a cryptic prophage with dis immunity (6, 13). Later, three nonhomologous regions between P2 and P2 Hy dis were identified by electron microscope heteroduplex mapping (12). Whether P2 Hy dis was generated by one or several recombination events is not known, since the cryptic prophage has not been sequenced. However, a comparison of the DNA sequence of the int gene regions of phages P2 and ΦD266, belonging to immunity class I, with those of ΦD252 and P2 Hy dis, belonging to immunity class II, shows a high level of identity. The variation is only between 0.5 and 1% at the nucleotide level (unpublished data). Thus, homologous recombination between the int genes and gene B between phages of immunity groups I and II would generate the same heteroduplex loop in this region as that obtained with P2 and P2 Hy dis.
Considering the clonal evolution of P2-like coliphages, the question is from where the different immunities originated. One possible scenario is that a phage with a different host preference occasionally infects E. coli, allowing recombination between homologous regions with a resident P2-like prophage that will gain a new immunity and higher survival fitness when competing with other P2-like coliphages. Since a survey of immunities of P2-like phages in other enterobacteria has not been performed, this hypothesis cannot be validated.
The evolution of new phage attachment sites is more complex due to the complex structure of attP. The heterobivalent Int protein binds not only to the core sequence but also to the arm sites located on each side of the core, excluding recombination between an incoming phage and a prophage as a possibility for gaining new site preferences. In addition, the Cox binding sites located between the core and one of the arm sites must be compatible with the binding sites in the transcriptional switch. The generation of new site preferences may therefore occur by two routes. P2 saf is a P2 mutant with an altered site preference that has a single-base-pair substitution within the core sequence (37, 38). P2 saf was isolated from a P2 prophage located at secondary integration site II (not to be confused with regular integration site 2 in this paper) and is an example of how new attachment sites may evolve after integration and excision from secondary attachment sites. Under these conditions, the recombinant phage should have the same immunity as before, but it should have a new site preference. Alternatively, new attachment sites could be generated by homologous recombination between two phage genomes present simultaneously in the same cell, where one phage originates from a different bacterial host. Since the Cox protein has a dual function, the recombination event must include the whole attP, int, C, and cox region, possibly leading to a change in attachment site and immunity at the same time. A possible example of such a recombination event is the Erwinia prophage ΦECA29, which integrates at the same location as the P2-like coliphages of class V and VI, but since then, new recombination has occurred, giving different immunities. It is also interesting that ΦD145 integrates into the coding part of a gene. It integrates into pflA and provides the 5' sequence that changes the start codon to an AAG and restores the 5' end of the pflA gene, as opposed to most genetic elements, which restore the 3' end of the disrupted gene. The core region and attB of ΦD145 have some mismatches, in contrast to P2 and WΦ, whose att sites are identical. As with the immunity, the integration of ΦD145 shows imperfections that could be interpreted as evolution in progress.
The fact that we have not found any P2-like phages belonging to the same immunity group that have different attachment sites is intriguing, since under laboratory conditions, P2 triple lysogens can be established where P2 is integrated at secondary sites (23). It is possible that phages having the same immunity but different attachment sites are not as competitive and are therefore removed by selection.
| ACKNOWLEDGMENTS |
|---|
ΦD160 and Georgina Ibrahim Isak for providing unpublished observations. This work was supported by grants from the Swedish Research Council and the Sven and Lilly Lawski Foundation.
| REFERENCES |
|---|
isolated from Escherichia coli strain W. Genet. Res. Camb. 9:135-139.
, a P2-related but heteroimmune coliphage. J. Virol. 73:9816-9826.