Journal of Bacteriology, March 2008, p. 2172-2182, Vol. 190, No. 6
0021-9193/08/$08.00+0 doi:10.1128/JB.01657-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Laura J. Marinelli,
Deborah Jacobs-Sera,
Roger W. Hendrix, and
Graham F. Hatfull*
Department of Biological Sciences and Pittsburgh Bacteriophage Institute, University of Pittsburgh, Pittsburgh, Pennsylvania 15241
Received 12 October 2007/ Accepted 22 December 2007
The evolution of mosaic genomes requires the creation of novel junctions at module boundaries. These module junctions often correspond closely with gene boundaries, and two distinct models have been proposed to explain this mosaicism. The first was proposed by Susskind and Botstein (34) and supposes that there are short conserved sequences at gene boundaries that act as recombination targets. Such boundary sequences have been described in some Escherichia coli phages (6), but these appear to be exceptional rather than typical examples. The second model proposes that module junctions are a result of illegitimate recombination events that occur without DNA sequence identity (beyond a few nucleotides in length), accompanied by selection for gene function and genome length of a packageable size (15). While such events are likely to be rare, the large population size, a long evolutionary history, and active phage growth (estimated at a global rate of 10²⁴ infections/second) (16) provide ample opportunity for such low-frequency events to play key evolutionary roles. Moreover, this is a creative process generating novel genetic junctions that are expected to remain stable within the population for long periods (i.e., as molecular fossils) but that can be further moved around by homologous recombination between shared gene sequences.
The absence of shared sequences at most presumed module junctions suggests that the boundary sequence model is unlikely to be generally applicable for creating new module junctions. However, there is also only rather poor direct evidence in support of the illegitimate-recombination model, since there are few examples of recombination events that have occurred sufficiently recently in evolutionary time for us to be able to interpret where the recombination events took place. One plausible example of such an event has been described among mycobacteriophage genomes (28), where a short DNA sequence of near identity is shared by two otherwise dissimilar genomes and the recombinant junctions can be inferred. However, the illegitimate-recombination model also suggests that genetic exchange would occur between the host and phage genomes (since extensive homology is not required) and that the comparatively large size of bacterial chromosomes would make these relatively frequent events. While "bacterial" genes are indeed commonly found in phage genomes, there are few examples that we are aware of where the mechanism of acquisition has been described, apart from the classic incorporation of gal or bio genes in the generation of lambda specialized transducing phages (4).
In this paper, we describe a novel mycobacteriophage (Giles), a distant relative of other phages that infect the same Mycobacterium smegmatis host, and its comparative genome analysis. While more than 30 mycobacteriophage genomes have been sequenced (12, 31), making them, together with groups of phages that infect dairy bacteria (3), staphylococcus (21), and pseudomonas (22), one of the largest groups of bacteriophages that infect a common host, mycobacteriophage Giles has a unique genomic architecture and is highly mosaic, and more than 50% of its putative genes are novel. Among the many interesting features of the Giles genome, two aspects are particularly illuminating in the process of phage evolution. First, Giles carries an active integration cassette that is located in an atypical position within the structural gene operon, suggesting that site-specific recombination events can contribute to noncanonical genome structures. Secondly, Giles carries a short DNA segment at the right end of the genome that is 100% identical to the host metE gene, providing direct evidence for the role of illegitimate recombination between phage and host genomes in generating mosaic genomic architectures.
Sequence determination. To prepare a library of Giles, genomic DNA was sheared using HydroShear (Gene Machines, Inc.) and repaired, and 1- to 3-kbp DNA fragments were purified by gel extraction. The DNA fragments were cloned into the EcoRV site in the pBluescript vector and recovered in Escherichia coli XL1 Blue cells. DNAs from approximately 480 individual clones were prepared and sequenced from both ends of each insert using forward and reverse primers. The sequences of these clones were assembled into a single contig, and ambiguous regions were resolved by sequencing directly from Giles DNA with oligonucleotide primers.
Construction of Giles integration-proficient vectors. Giles integration-proficient plasmid vectors, pGH1000A and pGH1000B, were constructed by PCR amplification of a 1.7-kbp fragment of the Giles genome using primers 5'-TGACGATCAACTCCGCGGGGCCGGGCCA and 5'-GGAATGATGATCGCCGCGGTGACACAATCGGCG and cloning this fragment into the DraI site of plasmid pMosBlueHyg, a derivative of pMosBlue in which a hygromycin resistance cassette had been inserted. Electroporations of M. smegmatis mc²155 with pGH1000A and pGH1000B were performed with approximately 50 ng DNA.
Identification of Giles virion proteins. CsCl-banded Giles virions were dialyzed into phage buffer and subjected to three rounds of freezing on dry ice and thawing on wet ice. This sample was mixed with sodium dodecyl sulfate (SDS) sample buffer and heated to 95°C for 3 minutes. Proteins were separated by SDS-polyacrylamide gel electrophoresis (PAGE) on a 10% gel and visualized by staining with colloidal Coomassie (Invitrogen). Bands were excised and sequenced using NanoLC-MS/MS peptide-sequencing technology (ProTech, Inc.). Briefly, protein bands were in-gel digested with sequencing grade modified trypsin (Promega). The peptide mixtures were then analyzed by a liquid chromatography-tandem mass spectrometry system (Thermo), consisting of high-pressure liquid chromatography with a 75-µm-inner-diameter reverse-phase C18 column on-line coupled to an ion trap mass spectrometer. The spectrometric data were searched against a database consisting of all predicted Giles open reading frames, and the output was manually verified. When not otherwise indicated, all other chemicals used in proteolytic digestion and high-pressure liquid chromatography were obtained from Sigma. N-terminal sequencing confirmed the identities of the phage-encoded proteins gp10 and gp15.
Nucleotide sequence accession number. The GenBank accession number of the Giles genome is EU203571.
FIG. 1. Morphology of mycobacteriophage Giles virions. Shown is an electron micrograph of mycobacteriophage Giles particles negatively stained with uranyl acetate. Scale bar = 100 nm.
Genome organization of mycobacteriophage Giles. Annotation of the Giles genome revealed 79 open reading frames and no tRNA or other small RNA genes (Fig. 2 and Table 1). Remarkably, only 31 of these putative open reading frames (39%) have identifiable sequence similarity to other mycobacteriophage genes at the amino acid sequence level (Table 1), although at least two that do not match other mycobacteriophage genes have other database matches. There is an easily identifiable gene encoding a tyrosine integrase (gp29) located near the center of the genome and transcribed leftward, and the adjacent gene on its right, which is transcribed rightward, encodes a putative excise protein. As with other integrase-containing phage genomes, we will refer to the segment to the left of the integration cassette as the "left arm" and the genes to the right as the "right arm" (Fig. 2). Giles is unusual among the mycobacteriophages in that virion structural genes are present in both the left and right arms (see below).
FIG. 2. Organization of the mycobacteriophage Giles genome. The Giles genome is represented by horizontal lines, with putative genes shown as boxes above (transcribed rightward) or below (transcribed leftward); the number of each gene is shown within its box. All genes have been sorted into phamilies (Phams) of related sequences using the computer program "Phamerator" (S. Cresawn, R. W. Hendrix, and G. F. Hatfull, unpublished data); the phamily number is displayed above each gene, and the boxes are color coordinated accordingly. Note that the Pham numbers differ from those described previously (12). Putative gene functions are noted. The positions of putative transcription promoters as identified by DNA Master (http://cobamide2.bio.pitt.edu) are shown as arrows.
TABLE 1. Coordinates of mycobacteriophage Giles genes
300 bp to the left of the integrase gene (which, as described below, contains the attP site), 300 bp to the right of the excise gene (30), 400 bp between genes 61 and 62, and 520 bp between the last gene (79) and the right end of the genome (Fig. 2). While accurate promoter prediction remains challenging (11), a subset of promoters with sequence similarity to the σ70 promoters of E. coli is recognizable, and such promoters have been shown to be important in other mycobacteriophages (27). We have identified four such putative promoters in the Giles genome, two that are linked to leftward-transcribed genes at the left end of the genome and two at the extreme right end, one facing leftward and the other rightward (Fig. 2). All three of the leftward-facing promoters are of interest because they suggest that the promoters, along with their accompanying open reading frames, are parts of morons (18) and have been acquired relatively recently.
Giles virion structure and assembly genes. One of the most striking features of the Giles left arm is that, of the 28 putative genes, only 12 are clearly related to those of other mycobacteriophages (Fig. 2). These include genes encoding a large terminase subunit (gp5), a prohead protease (gp7), the major capsid subunit (gp9), the tail tape measure protein (gp20), and five minor tail proteins (gp21, gp22, gp24, gp25, and gp27). From their genomic positions, we also predict that genes 4, 6, and 8 encode a putative small terminase subunit, a portal protein, and a capsid assembly protein, respectively. Curiously, no sequence relationship between the putative Giles capsid subunit (gp9) and any other mycobacteriophage capsid proteins could be detected, although after several rounds of PSI-BLAST analysis, matches to capsid proteins of several streptococcal and staphylococcal phages were detected. This assignment was confirmed by analysis of Giles virion proteins (Fig. 3) showing that gp9 is one of the most abundant structural components; we note, however, that the gp9 subunits are not covalently cross-linked, as is common among other mycobacteriophages (9, 10, 13, 25). Among the 31 previously sequenced mycobacteriophage genomes, putative capsid subunits can be identified in 24, and they fall into four distinct sequence families (12). The Giles capsid gene thus further expands the diversity of this group of proteins.
FIG. 3. Virion proteins of mycobacteriophage Giles. Purified Giles particles were denatured, and the proteins were separated by SDS-PAGE. Markers (M) are designated by their masses in kDa. Putative assignments of bands to Giles gene products were determined by mass spectrometry and N-terminal sequence analysis of individual bands.
Twelve Giles virion proteins (gp6, gp9, gp10, gp12, gp15, gp20, gp21, gp22, gp24, gp25, gp27, and gp36) were identified by mass spectrometry following SDS-PAGE separation (Fig. 3); some peptides from both gp8 and gp16 were identified in the gp15 and gp10 samples, suggesting that they are also virion proteins. The second most abundant protein after the putative capsid subunit gp9 appears to be gp15, which therefore is a good candidate for the major tail subunit, and PSI-BLAST analysis revealed that it is a distant relative of the putative major tail subunit of mycobacteriophage Che9c gp14. In all other genomes of noncontractile tailed mycobacteriophages, the genes between the major tail subunit and the tape measure protein include a pair of overlapping open reading frames that are expressed via a programmed translational frameshift (12); this is one of the most highly conserved features of double-stranded DNA tailed phages (37), and it results in the production of assembly chaperones for the tail. In Giles, there are four genes (16, 17, 18, and 19) between the major tail subunit and the tape measure protein gene rather than the more usual two genes, making it unclear a priori which two genes might be participating in a frameshift. However, gp17 shows weak but significant sequence similarity to genes occupying the second position of the frameshifting pair in other mycobacteriophages (Bxb1 gene 21, Bxz2 gene 25, and Llij gene 13), so we tentatively identified Giles genes 16 and 17 as the gene pair that participates in programmed translational frameshifting in the phage. The two open reading frames overlap and are related in such a way that a +1 frameshift would be required for the ribosome to shift between them. We did not find any clear candidates for a "slippery sequence" in the overlap region, though we note that such sequences are less apparent than the signals for the more commonly encountered –1 frameshifts. 
We did find a "Shine-Dalgarno-like" sequence and strong potential for an RNA pseudoknot, both in the immediate vicinity of the end of gene 16 (Fig. 4). Such sequence features have been implicated in potentiating frameshifting in other systems (2), including two cases of +1 frameshifting at the ends of structural genes in the Listeria phage PSA that are associated with pseudoknots of somewhat different topology from what we see here (38).
FIG. 4. Predicted pseudoknot at the end of the gene 16 mRNA. Bioinformatic analysis suggests that there is a programmed +1 translational frameshift near the end of gene 16 that would shift ribosomes translating gene 16 into the gene 17 reading frame (see the text). We found the illustrated potential pseudoknot and Shine-Dalgarno-like ("S-D") sequence in the immediate vicinity of the end of gene 16.
Among the other genes in the 38-to-79 segment of the Giles genome, the functions of only two additional genes can be deduced. Giles gene 62 encodes a putative DNA methylase that is likely part of a restriction-modification system, and while a putative restriction gene is not obvious, gene 61 is a possible candidate. Giles gene 68 encodes a WhiB-like regulator protein, and although the specific function of gp68 is not clear, we note that whiB-like genes are quite prevalent among the mycobacteriophages, and four of the nine whiB-containing genomes contain two copies. Giles gene 79 encodes a protein with similarity to MetE and is discussed in further detail below. Few of the genes in the 38-to-79 segment are related to other mycobacteriophage genes, but those that are identifiable are related to different genomes, consistent with a mosaic genome architecture (Fig. 2).
Is Giles a temperate phage? Giles forms plaques on lawns of M. smegmatis that are neither absolutely clear (for example, D29 [9]) nor as turbid as those of the obvious temperate phages, such as L5 and Bxb1 (Fig. 5A) (13, 17). Many mycobacteriophages are in this category and apparently do not form lysogens of M. smegmatis at high frequency (12). However, many of these, like Giles, have an integration cassette, suggesting that they either are competent to form lysogens (albeit at a reduced frequency in M. smegmatis) or possibly are recent derivatives of temperate parents that have lost their lysogenic functions during the isolation procedures; it is also possible that they efficiently lysogenize bacterial hosts other than M. smegmatis.
FIG. 5. Analysis of Giles lysogens of M. smegmatis. (A) Top agar lawns were prepared with either a putative Giles lysogen or nonlysogenic mc²155 (as indicated), and 5 µl of serial dilutions of phage L5 or Giles was spotted onto the lawn. Giles lysogens are immune to superinfection by Giles, and killing is only observed at the highest phage dose (corresponding to 10¹⁰ PFU). (B) A Giles lysogen was similarly tested for immunity to a collection of other mycobacteriophages, all of which infect the lysogen and nonlysogen similarly. A key to the phages tested is shown below.
To determine the frequency of lysogenization under a set of standard conditions, we determined the proportion of bacterial colonies recovered following the plating of dilutions of an M. smegmatis culture on solid media seeded with 10⁹ Giles particles. Using dilutions that yielded approximately 2,000, 200, and 20 colonies on control medium, colonies were recovered at a frequency of 2%, reflecting the frequency of lysogenization. It has been shown previously using similar assays that L5 lysogenizes at a frequency of about 22% but that a mutant with a clear plaque morphology lysogenizes at only 7% (7, 33). Thus, even this L5 mutant lysogenized at a higher frequency than Giles. The molecular basis for the low frequency of Giles lysogenization is not clear.
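The frequency described above is a simple ratio of colonies recovered on phage-seeded medium to colonies recovered on control medium. The following Python sketch illustrates that arithmetic; the function name and the colony counts are hypothetical and chosen only to reproduce a 2% outcome, not taken from the experimental data.

```python
def lysogenization_frequency(colonies_on_phage_seeded, colonies_on_control):
    """Return the fraction of plated cells that form colonies on phage-seeded medium."""
    if colonies_on_control <= 0:
        raise ValueError("control plate must have at least one colony")
    return colonies_on_phage_seeded / colonies_on_control


# Hypothetical counts (not from the paper) for two of the dilutions.
example_counts = [(40, 2000), (4, 200)]
for seeded, control in example_counts:
    freq = lysogenization_frequency(seeded, control)
    print(f"{seeded}/{control} colonies -> {freq:.1%}")
```

At the dilution giving about 20 control colonies, a 2% frequency corresponds to less than one expected colony, so the higher-density platings carry most of the information in an assay of this kind.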
While phage integration functions are relatively easy to identify in most phages that encode them, the immunity functions are diverse. For example, while the repressors of L5 and Bxb1 have been identified experimentally (7, 17), and related copies are present in mycobacteriophages Bxz2, Che12, U2, and Bethlehem (12), there are no identifiable homologues in any other phage or bacterial genome sequences. Examination of the Giles genome revealed no homologues of the L5/Bxb1 repressor group or of any other phage repressors (Fig. 2). An intriguing candidate, and the only gene related to other transcriptional regulators, is gene 68, which encodes a WhiB family protein; moreover, this gene is located in a region (i.e., 10 to 20% of the genome length to the left of the right end) where phage repressors are commonly positioned. However, repressor genes are typically associated with closely linked upstream regulatory sequences, and the nearest likely location is more than 3.5 kbp away, in the gene 61-to-62 noncoding region (Fig. 2). Interestingly, there are WhiB family members in a number of other mycobacteriophage genomes (e.g., Tweety, Llij, Che9d, PMC, and Che8), many of which also have integration cassettes.
The unusual location and organization of the Giles integration cassette. Sixteen of the previously characterized mycobacteriophages have an integrase gene that is typically located close to the center of the genome and separates the structural genes to the left (i.e., the left arm) and the nonstructural genes to the right (i.e., the right arm) (12); in all of these genomes, the lysis functions are encoded to the left of the integrase gene. Giles represents a significant departure from this pattern, since the group of seven rightward-transcribed genes to the right of the excise gene (30) encode the lysis functions (genes 31 and 32) (Fig. 2), and genes 34 and 35 are homologues of genes found in the structural operons of mycobacteriophages Halo, PG1, Cooper, and Orion (12); in Halo, genes 30 and 31 (homologues of Giles 34 and 35, respectively) are situated immediately to the left of the integrase gene. Furthermore, the product of Giles gene 36 (gp36) is likely a structural protein and is present in Giles virions (Fig. 3).
A simple interpretation of the Giles organization is that the integrase/excise cassette has relocated to a position within the "left arm" of the genome so that it now sits amid the structural genes (Fig. 6). This organization is additionally puzzling, since a putative rightward transcription terminator is located between the attP core and the integrase gene (see below), so that the 30-to-37 group of genes must be expressed either from an additional promoter or via an antitermination mechanism. It is unclear what mechanism gave rise to the repositioning of the integration cassette, although it is plausible that it occurred by an integrase-mediated event utilizing a secondary attachment site within the left arm rather than by an illegitimate-recombination process that could also give rise to other phage rearrangements (15, 28). This raises the questions of whether the integration cassette is functional and whether the lysogens described above contain an integrated prophage.
FIG. 6. Positioning of integration functions in the Giles structural operon. The Giles integration functions are atypically positioned among structural protein genes and the lysis genes. Shared Giles and Halo genes are joined by red shading, and known Giles virion proteins are labeled. While Halo and Giles are not globally related, the alignment illustrates the typical location of the integration cassette at the end of the structural gene operon in Halo, with the lysis and structural genes all to the left of int. In Giles, the integration cassette has moved into an atypical position within the structural operon, with the lysis genes to its right. Pham designations are shown above the genes as described in the legend to Fig. 2.
FIG. 7. Integration functions of mycobacteriophage Giles. (A) Nucleotide sequence of the intergenic region between Giles genes 28 and 29 (coordinates 24990 to 25285) that includes the attP site (Fig. 2). The 46-bp common core shared with the M. smegmatis genome is shown as a thick horizontal line with the single base difference boxed; the coordinates of the common core in Giles are 25134 to 25179; putative arm-type integrase binding sites (P1 to P4) are also shown. Located between the common core and the right arm-type binding sites is a putative stem-loop terminator for rightward transcription. (B) Structure of M. smegmatis tRNAPro (Msmeg_3734), which includes the Giles attB site with an arrow indicating the position corresponding to the 5' position of the common core. The tRNA position that changes following Giles integration as a consequence of the base difference is circled. (C) Organization of integration-proficient plasmid pGH1000A and its integration into the M. smegmatis chromosome. (D) PCR amplification of attachment junctions of Giles lysogen and integrated plasmids. Isolated colonies from transformations with Giles integrating vectors and control plasmids, both integrating and extrachromosomal, and a single colony from the Giles lysogen were resuspended in 200 µl sterile distilled H₂O and boiled at 95°C for 6 min, and 1 µl was used as a template in PCRs with primer pairs that amplify either a 770-bp attL fragment or an 842-bp attB fragment. The attL product is amplified from the Giles lysogen and from all Giles transformants; the attB product is obtained with control colonies and wild-type mc²155.
Several mycobacteriophage genomes (such as L5) (13) contain transcriptional-terminator-like structures near the integration apparatus. Giles also contains a stem-loop terminator-like structure in this region, but it is atypically located within the putative attP site, positioned between the 46-bp common core and the P3/P4 pair of arm-type sites (Fig. 7A). While this is an unusual organization, there may be a strong selection for a terminator immediately to the right of the attL junction sequence in an integrated Giles prophage to prevent expression originating from the chromosomal tRNA gene from extending through to the xis gene (30), as well as the lysis functions. Errant expression of xis in a Giles prophage would likely lead to instability of lysogens, while premature expression of the lysis genes could likely be toxic to a Giles lysogen.
Construction of integration-proficient vectors. To determine whether the Giles integration cassette is functional, we constructed integration-proficient vectors containing the Giles attP site and integrase gene. Plasmids pGH1000A and pGH1000B were constructed that contained a 1.7-kbp attP-int cassette (coordinates 24922 to 26617) inserted in opposite orientations in a hygromycin-resistant nonmycobacterial replicating plasmid vector; all of attP, including the putative arm-type sites, was included, as well as the entire integrase gene and the 28-to-29 intergenic region (where int expression signals may reside), but not all of gene 30, which encodes the putative excise function. When electroporated into M. smegmatis, these plasmids generated transformants with an efficiency of 3 × 10⁵ to 6 × 10⁵ transformants/µg DNA, and analysis of the transformants by PCR showed that all contained the plasmid integrated at the predicted attB site (Fig. 7D); while the transformation frequencies are similar to those with extrachromosomally replicating plasmids, they are modestly lower (approximately 5- to 10-fold) than those with L5 and Bxb1 integration vectors. These vectors thus add to the repertoire of mycobacterial integration-proficient vectors that are compatible and stable and that generate single-copy recombinants (20, 21, 31).
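Transformation efficiencies of this kind are conventionally expressed by scaling the colony count to the whole electroporation and dividing by the mass of DNA used. The sketch below shows that calculation in Python; the function name, the colony count, and the plated fraction are hypothetical illustrations, and only the 50-ng DNA input comes from the methods described in this article.

```python
def transformants_per_microgram(colonies, dna_ng, fraction_plated=1.0):
    """Scale a colony count to transformants per microgram of input DNA.

    colonies        -- colonies counted on the selective plate
    dna_ng          -- DNA used in the electroporation, in nanograms
    fraction_plated -- fraction of the recovered cells that was plated
    """
    if dna_ng <= 0:
        raise ValueError("dna_ng must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    total_transformants = colonies / fraction_plated
    return total_transformants / (dna_ng / 1000.0)  # ng -> µg


# Hypothetical plate: 150 colonies after plating 1% of a 50-ng electroporation.
efficiency = transformants_per_microgram(colonies=150, dna_ng=50, fraction_plated=0.01)
print(f"{efficiency:.1e} transformants/µg")
```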
Acquisition of host DNA by illegitimate recombination. Giles gene 79 is located at the right end of the genome but is transcribed leftward, in contrast to the genes to its immediate left. Database searches revealed that the protein product is closely related to parts of bacterial MetE proteins, although the related parts are confined to a small segment in the central part of MetE (Fig. 8A). Interestingly, gene 79 contains a 203-bp segment that is 100% identical to the M. smegmatis metE gene (Msmeg_6638), although the alignment can be modestly extended at the right end by deletion of a single base in the Giles 79 sequence (Fig. 8B). Because the shared sequence has not yet diverged, the acquisition of it by Giles must have occurred quite recently, and thus the junctions between conserved and flanking sequences likely correspond closely to the sites of recombination. Since it is not obvious that this sequence could have been acquired by any targeted process, we conclude that it occurred through illegitimate recombination. Furthermore, we note that there is a segment of low-GC% DNA at the end of the Giles genome, and the transition between high and low GC% appears to correspond closely to the right junction of the conserved sequences (Fig. 8C), consistent with an illegitimate-recombination event at this site. While the acquisition of bacterial genomic sequences has been proposed to occur by such a process (15, 28), this is—to our knowledge—the first instance where an early stage in the process has been captured. In considering which specific host genome gave rise to the Giles sequence, it is likely that it came directly from either M. smegmatis or a very closely related strain. We note, for example, that the next closest bacterial metE sequences are in Mycobacterium avium and Saccharopolyspora erythraea, which have only 85% nucleotide identity to Giles gene 79. 
It seems unlikely that gp79 has any functional features of MetE, since MetE is a large protein (771 residues in M. smegmatis), and only 66 of the residues are present in gp79. While this segment includes several residues involved in the binding of folate (29), it is not obvious from the three-dimensional structure of MetE that a complete folate binding domain could be formed (29). While gp79 may indeed be nonfunctional, its acquisition from the bacterial chromosome could have occurred to satisfy a required packaging size of the genome. It seems reasonable to assume that this sequence could serve as a substrate for subsequent homologous-recombination events with the chromosome that would lead to creation of a functional gene.
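A shared block of perfect identity, such as the 203-bp segment discussed here, can be located computationally as the longest exact match between two sequences. The Python sketch below uses a standard dynamic-programming longest-common-substring search; it is an illustration rather than the method used in this study, the toy sequences are invented, and it assumes both sequences are supplied in the same strand orientation (a real comparison would also test the reverse complement).

```python
def longest_shared_block(a, b):
    """Return (start_in_a, start_in_b, length) of the longest exact match.

    Runs in O(len(a) * len(b)) time, which is adequate for gene-sized inputs.
    """
    best_len, best_a, best_b = 0, 0, 0
    prev = [0] * (len(b) + 1)  # match lengths ending at the previous base of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_a = i - best_len
                    best_b = j - best_len
        prev = curr
    return best_a, best_b, best_len


# Invented toy sequences: a 9-base block embedded in unrelated flanks.
phage = "TTTT" + "ACGTACGGA" + "AAAT"
host = "GGC" + "ACGTACGGA" + "CCG"
start_phage, start_host, length = longest_shared_block(phage, host)
print(start_phage, start_host, length, phage[start_phage:start_phage + length])
```

The block boundaries returned by such a search mark where identity breaks down, which is the information used in the text to infer the likely sites of recombination.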
FIG. 8. Acquisition of host DNA by illegitimate recombination. (A) Giles gene 79 encodes a putative 103-residue protein with sequence similarity to M. smegmatis MetE; residues 20 to 86 of Giles gp79 are identical to residues 493 to 559 of MetE. (B) Nucleotide sequence comparison showed that the central part of Giles gene 79 contains a 203-bp sequence block that is 100% identical to a central part of the M. smegmatis metE gene (Msmeg_6638). It is not known if Giles gp79 is functional, but the common DNA sequence has apparently been acquired from a mycobacterial chromosome (possibly M. smegmatis) by illegitimate recombination. (C) A transition in GC% at the right end of Giles gene 79 likely reflects an illegitimate-recombination event at the right end of the shared nucleotide block. The GC%s of both the M. smegmatis and Giles genomes are 63 to 65%, but the right end of the Giles genome has a substantially lower GC%. The panel corresponds to approximately 1.5 kbp at the extreme right end of the Giles genome. The reduction in GC% occurs at coordinate 54000, and the GC% of the segment between there and the end of the genome is about 50%.
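A GC% transition like the one shown in panel C is typically visualized by computing GC content in windows along the sequence. The following Python sketch shows one way to do this; the window and step sizes and the toy sequence are hypothetical, and the article does not state which tool or parameters were used to produce the figure.

```python
def gc_percent(seq):
    """Return the percentage of G and C bases in a non-empty sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def sliding_gc(seq, window, step):
    """Return (start, GC%) for each full window along the sequence."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Invented toy sequence: a GC-rich half followed by an AT-rich half.
toy = "GC" * 10 + "AT" * 10
for start, gc in sliding_gc(toy, window=10, step=10):
    print(start, gc)
```

With overlapping windows (a step smaller than the window), the position at which GC% begins to fall can be compared directly with the boundary of the shared sequence block.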
The pervasive mosaicism among bacteriophage genomes can generally be accounted for by assuming not only that illegitimate recombination has occurred widely throughout viral evolution, but that it represents an especially creative process. A more interesting question that then arises concerns who actually participates in this orgy of illicit genetic exchange. Analysis of the Giles genome suggests that it involves a broad variety of phage genomes, as illustrated by the large number of individual genes that are not related to other known phages, as well as the host chromosome. Such recombination events between phages and their bacterial hosts have been postulated previously, but Giles represents one of only very few examples where the event giving rise to this acquisition can be inferred. Thus, a picture of microbial evolution emerges that is messy, in which genetic variation is generated by illegitimate recombination, homologous recombination, and site-specific recombination and persists within the population either due to direct selection or as the nonadaptive consequence of other pressures, such as viral genome size selection.
A puzzling aspect of the phages of M. smegmatis is the predominance of those that form plaques that are neither obviously turbid nor clear but rather have plaque morphologies that are somewhere in between. The genomes of many of these phages show the presence of integration cassettes, suggesting that they either can form lysogens or are derivatives of temperate parents (12). While Giles has a similar plaque morphology, in this case, it is clear that the phage is competent to form lysogens carrying an integrated prophage, and the low frequency of lysogenic establishment accounts for the barely turbid plaque type. The location of the phage immunity functions remains unclear from bioinformatic analyses and will need to be determined empirically, as was done for mycobacteriophage L5 (7).
A notable feature of mycobacteriophage genomes is that while many contain integration cassettes, the predicted sites for chromosomal integration vary considerably (31). As expected, phages that share considerable similarity typically have similar integration systems, although at least 10 different integration sites have been postulated (31), with Giles choosing yet another site within a tRNAPro gene. Interestingly, this site appears to be used by a yet-uncharacterized prophage that is present in the sequenced mycobacterial genomes MCS and KMS. We also note that the Giles attB site is the closest of any of these to the predicted terminus of DNA replication, which is estimated to be located at approximately 3.41 Mbp in the M. smegmatis mc²155 genome (H. Hendrickson, J. van Kessel, and J. Lawrence, unpublished data). Giles-derived integration-proficient vectors may thus be useful for inserting genes into the terminus-proximal region and using them to study expression and recombination in this part of the genome. It seems likely that many more integration sites will be identified for other mycobacteriophages, and it seems reasonable that all or most of the 47 M. smegmatis tRNA genes may be utilized. This makes the development of integration-proficient vector systems especially attractive, since multiple integration events can be used to construct complex genetic variants (31).
Finally, while mycobacteriophages represent one of the best-studied groups of phages that infect a common bacterial host, the apparently high degree of genetic diversity suggests that the genomic characterization of mycobacteriophages will continue to reveal interesting surprises and insights into phage diversity and evolution. Questions regarding the origins of genomic mosaicism, the number of exchangeable modules, the identities of recombining participants, and the frequency of genetic exchanges remain largely unanswered. However, characterization of the Giles genome suggests that continued phage genomic characterization will throw considerable light upon these questions.
This work was supported by a grant to the University of Pittsburgh from the Howard Hughes Medical Institute (HHMI) in support of Graham Hatfull under HHMI's Professors program and by a grant from NIH to R.W.H. (GM51975).
Published ahead of print on 4 January 2008.
Supplemental material for this article may be found at http://jb.asm.org/.
Present address: UG2, School of Biological Sciences, University of East Anglia, Norwich, United Kingdom NR4 7TJ.