and
Ry Young1*
Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843-2128,1 Department of Plant Pathology and Microbiology, Texas A&M University, College Station, Texas 77843-2132,2 Microbiology Department, BIOMERIT Research Centre, National University of Ireland, Cork, Ireland,3 Department of Pediatrics and Communicable Diseases, University of Michigan Medical School, Ann Arbor, Michigan 481094
Received 20 June 2005/ Accepted 5 October 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
If phage mosaicism is limited mostly by physical access, then phages of hosts that occupy widely disparate ecological niches might be in a unique position to undergo mosaic exchange. The Burkholderia cepacia complex (Bcc) consists of heterogeneous members of the beta-Proteobacteria. Bcc members include opportunistic human pathogens, like B. cenocepacia, which account for the majority of Bcc infections in persons with cystic fibrosis, and phytopathogens, particularly B. cepacia, the causative agent of onion "sour skin" (5). Members of Bcc can also be recovered from soil (22), water samples (52), and the rhizosphere of crop plants (12). Bcc isolates are not necessarily specific for one niche. For example, isolates of the B. cenocepacia electrophoretic type PHDC, a significant cause of cystic fibrosis Bcc infections, have been recovered from agricultural soils (34). We have found these soils to be a rich source of Bcc phages as well (C. F. Gonzalez, G. L. Mark, E. Mahenthiralingam, and J. J. LiPuma, Isolation of soilborne genomovar I, III and VII Burkholderia cepacia and lytic phages with intergenomovar host range, Int. B. cepacia Working Group 6th Annu. Meet., p. 115-117, 2001). In contrast with the distribution of phage morphotypes in the literature, our isolates are heavily biased towards myophages. Here, we describe the genomic organization of three Bcc-specific phages (Bcep phages) isolated from soil samples at disparate locations and times. The organization and relationships of these Bcep phages are discussed with respect to current models of phage genome evolution.
| MATERIALS AND METHODS |
|---|
Phage isolation and imaging.
For enrichment, 2 g soil was incubated for 30 min at room temperature with shaking in 20 ml of 0.1% peptone broth. After settling, the top 10 ml was removed, clarified twice by low-speed centrifugation, and filtered through a 0.2-µm filter to generate a cell-free phage suspension. Phages were isolated by inoculation of a 25-ml logarithmic (A550 of ~0.4) Bcc culture with 1 ml of phage suspension and incubation for 20 min at room temperature without shaking, followed by overnight incubation with shaking (200 rpm) at 28°C. The culture was cleared of cells by centrifugation and filtration as above, generating a phage lysate. The titers of phage lysates were then determined with the host used for enrichment, and individual plaques were isolated. Pure phage stocks were obtained by amplification from a single plaque, followed by reisolation from a single plaque and reamplification. Preparation of high-titer lysates, determination of Bcep phage titers, and imaging by transmission electron microscopy were done as described previously (49).
Phage Bcep781 was isolated by S. Beer (Cornell) in 1978 from Orange County, NY, muck soils as a plaque former on Bcc strain 74-34, an onion pathogen provided by J. Lorbeer (Cornell) (20). Phage Bcep43 and its original host, Bcc43, were isolated from muck soil of Orange County, NY, obtained in 1999. Phage Bcep1 and its original host Bcc strain, HI2424 (34), were isolated from Oswego County, NY, soils in 1999. Phage BcepB1A and its original host, S198B1A, were isolated from soils obtained at a different site in Oswego County, NY, in 2000.
Phage infection parameters.
The eclipse period and burst size for bacteriophage Bcep781 were determined by conducting a one-step growth experiment, as described previously (49). The kinetics of phage adsorption was determined by infecting a logarithmic-phase culture of B. cepacia strain 74-34, in the presence or absence of 0.01 M MgSO4 or 0.01 M CaCl2, at a multiplicity of infection of ~$10^{-3}$. Samples were taken at 5-min intervals, and titers were determined after removal of cells by filtration through a 0.2-µm filter (Nalgene). The rate of phage particle disappearance is defined as $dP/dt = -kBP$, where $B$ is the concentration of bacteria, $P$ is the concentration of free phage at any time ($t$), and $k$ is the adsorption constant in ml cell$^{-1}$ s$^{-1}$ (48).
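As an illustration of how an adsorption constant can be extracted from free-phage titers, the sketch below treats adsorption as first-order disappearance of free phage at a constant bacterial concentration, so that P(t) = P0·exp(−kBt) and k = ln(P0/P)/(Bt). The function names and the numerical inputs are hypothetical and are not taken from the experiments described here; the units follow the text (ml cell⁻¹ s⁻¹).

```python
import math


def adsorption_constant(p0, p_t, bacteria_per_ml, seconds):
    """Estimate k (ml cell^-1 s^-1) assuming dP/dt = -k*B*P with B constant.

    p0 and p_t are free-phage titers at time 0 and after `seconds`.
    """
    if min(p0, p_t, bacteria_per_ml, seconds) <= 0:
        raise ValueError("all inputs must be positive")
    return math.log(p0 / p_t) / (bacteria_per_ml * seconds)


def free_phage(p0, k, bacteria_per_ml, seconds):
    """Free-phage titer predicted by the integrated rate law."""
    return p0 * math.exp(-k * bacteria_per_ml * seconds)


# Hypothetical example: half of the free phage adsorbed after 5 min at 1e8 cells/ml.
k_example = adsorption_constant(1e5, 5e4, 1e8, 300)
```

In practice, k would be fitted to the full series of 5-min samples (for example by linear regression of ln P against t) rather than taken from a single pair of titers.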
Genomic analysis. Library preparation, shotgun sequencing, sequence assembly, and analysis were done essentially as described previously (49). The program Sequencher (Gene Codes Corporation) was used for sequence assembly from contigs. Areas of low-quality sequence were resequenced using primer walking. Protein coding regions were predicted initially using GeneMark.hmm (http://opal.biology.gatech.edu/GeneMark/) (2). Predicted coding regions were refined with Artemis (http://www.sanger.ac.uk/Software/Artemis/) (47). The predicted proteins were then compared to the NCBI protein database with BLASTP at the mirror site located at XBLAST (http://xBLAST.tamu.edu/pise/). Structural features (transmembrane helices and predicted molecular weights) of the proteins were determined with proteomic tools at ExPASy (http://us.expasy.org) (16). tRNAs were identified with the tRNAscan-SE search server (http://www.genetics.wustl.edu/eddy/tRNAscan-SE/). DNA pairwise comparisons were performed with the program Base by Base at the SARS Bioinformatics Suite (http://athena.bioc.uvic.ca/sars/index.php?page=tools). Phage genome maps were drawn utilizing the program DNA Master (http://cobamide2.bio.pitt.edu/computer.htm).
IST library construction and analysis. An interactive sequence tag (IST) library of phage Bcep781 was produced and analyzed as previously described for BcepMu (49). Protocols can be found at http://oligomers.tamu.edu/doodle (41).
Bcep781 genome end cloning. To clone the Bcep781 genomic end fragments, phosphorylated XbaI linkers (New England Biolabs) were ligated onto Bcep781 genomic DNA. The genomic DNA/linker ligation reaction product was digested to completion with XbaI and XhoI and ligated into XbaI/XhoI-digested pBluescript II SK (Stratagene). Transformants were picked at random and grown overnight in deep-well plates, with shaking (270 rpm). Plasmid DNA was isolated as described above and sequenced with the T3 primer (Stratagene). The positions of 47 independent end clones were determined with Sequencher (Gene Codes).
Amino-terminal sequencing of Bcep781 structural proteins. Phage lysate proteins were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and electrotransferred onto a polyvinylidene difluoride membrane. After the membrane was stained with Coomassie blue, the two predominant bands of 17 kDa and 33 kDa were excised from the blot and subjected to automated protein sequencing in the Texas A&M University Protein Chemistry Laboratory.
Cloning and assay of the Bcep43 endolysin.
To test for endolysin function, Bcep43 gene 27 was cloned into the expression vector pGemT-easy (Promega). (Bcep43 gp27 is identical in predicted amino acid sequence to gp27 of Bcep781 [see Table S2 in the supplemental material].) First, the coding sequence of gene 27 was amplified using Pfu polymerase and primers endo1 (ATAGGATCCCAGGAGGCCTGTAACATGGC) and 2endo2 (TCGGGCATTGTGTCAAGCTT). Following the manufacturer's guidelines, the resulting product was A-tailed with Taq polymerase, ligated into pGemT-easy, and transformed into E. coli JM109 cells. An insert with the correct orientation with respect to the T7 promoter in pGemT-easy was identified, designated pGemT-27, and transformed into XL1-Blue electrocompetent cells (Stratagene). For the cell lysis assay, overnight cultures containing pGemT-easy or pGemT-27 were diluted 250-fold into LB-ampicillin (ampicillin, 100 µg/ml) and aerated at 37°C. The cultures were induced at an A550 of ~0.2 to 0.3 with 1 mM IPTG (isopropyl-β-D-thiogalactopyranoside), and the culture density was monitored at 10-min intervals for 1 h. CHCl3 was then added to 1% final concentration, and the A550 was determined at 5 and 10 min after addition.
Nucleotide sequence accession numbers. The sequences of Bcep781, Bcep43, Bcep1, and BcepB1A have been entered as GenBank accession numbers AF543311, AY368235, AY369265, and NC_005886, respectively.
| RESULTS |
|---|
All four phages were found to have similar DNA sizes, of about 48 kb, based on pulsed-field gel electrophoresis (PFGE) (not shown). Transmission electron microscopy images revealed that all four phages had myophage morphologies, with collars, short appendages extruding from the baseplate, and isometric capsids decorated with knobs at the icosahedral vertices (Fig. 1). Bcep43 and Bcep781 plated efficiently on B. cepacia strains 74-34 and Bcc43, whereas phages Bcep1 and BcepB1A were restricted to the single isolates of B. cenocepacia Bcc1 and S198B1A, respectively. All four phages formed clear plaques on all susceptible hosts, suggesting that they were virulent; bioinformatics analysis of their genomes did not reveal any genes involved in lysogenization (see below).
PaPa, the commonly used variant of phage λ, to E. coli K-12 (24). Addition of 0.01 M MgSO4 or 0.01 M CaCl2 had no effect on the adsorption rate. At 28°C, the average Bcep781 burst size was found to be 180 PFU/cell, with a latent period of ~150 min (Fig. 2B).
The degree of circular permutation of Bcep781 was analyzed in more detail. Restriction maps of Bcep781 with several restriction enzymes with multiple recognition sites gave the pattern expected from a covalently closed circular molecule. However, the digests did not contain a submolar fragment that would indicate the presence of a pac site (57). In addition, digestion of Bcep781 genomic DNA independently with NdeI and NheI, with single cleavage sites 7.8 kb apart, resulted in a single band which resolved into a smear by PFGE (Fig. 3A). The close correlation in length as determined by PFGE (48.5 kb [Fig. 3A]) and the sequence length (48.2 kb) indicate the packaged DNA possesses little terminal redundancy. Bcep781 genomic DNA formed ladders following treatment with T4 DNA ligase (Fig. 3A), suggesting that the termini are not modified by proteins or dephosphorylated. A library enriched in phage genomic terminus clones was constructed by adding a linker with a unique restriction site (XbaI) to the end of purified genomic DNA and digesting the product with XbaI and an enzyme with multiple restriction sites (XhoI; 27 sites). The positions of 47 clones possessing XbaI and XhoI sites were mapped. The largest gap in the end clone library correlated to the largest gap in XhoI sites, which would produce fragments of a size likely to be underrepresented (Fig. 3B). The positions of the end clones were uniformly scattered throughout the length of the genome, which, along with the restriction digest analysis, indicates that Bcep781 has a highly circularly permuted genome similar to that of the classic coliphage T2 (50). These results are inconsistent with a packaging mechanism involving initial cleavage at or near a pac site, followed by subsequent rounds of headful packaging. However, the findings do not discriminate between random initiation, initiation at multiple pac sites, or terminase recognition of pac followed by movement and cleavage at distant sites (7, 33).
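The restriction-map reasoning above can be made concrete with simple fragment arithmetic. The sketch below is an illustration only, not the procedure used for Fig. 3: it compares the fragments expected from a circular map with those from a single linear molecule with fixed ends. The 48.2-kb length and the 7.8-kb spacing between the two single-cut sites come from the text; the absolute site coordinates (10,000 and 17,800) and the function names are hypothetical.

```python
def circular_digest(genome_length, cut_sites):
    """Fragment lengths obtained by cutting a circular map at the given positions."""
    sites = sorted({s % genome_length for s in cut_sites})
    if not sites:
        return [genome_length]  # uncut circle
    fragments = []
    for i, site in enumerate(sites):
        nxt = sites[(i + 1) % len(sites)]
        # Wrap around the origin; a single cut yields one full-length linear molecule.
        fragments.append((nxt - site) % genome_length or genome_length)
    return fragments


def linear_digest(genome_length, cut_sites):
    """Fragment lengths obtained by cutting a linear molecule with fixed ends."""
    sites = sorted({s for s in cut_sites if 0 < s < genome_length})
    bounds = [0] + sites + [genome_length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical coordinates for two single-cut sites 7.8 kb apart on a 48.2-kb genome.
circular = circular_digest(48200, [10000, 17800])
linear = linear_digest(48200, [10000, 17800])
```

For a population of permuted linear molecules, the internal fragments match the circular map while the end fragments differ from molecule to molecule, which is one way to rationalize a smear rather than discrete end bands.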
(i) DNA metabolism. Most of the Bcep781 genes encoding proteins with robust functional homologues were involved in DNA metabolism. These included a DNA methyltransferase (gp10), a Holliday junction resolvase (gp22), a helicase (gp60), and a DNA polymerase (Pol) I homologue (gp66). Phage genomes typically show clustering of related genes. Replication and recombination genes are generally encoded in early transcriptional units, whereas morphogenesis and lysis genes are expressed in late transcripts. In terms of functional groupings, five Bcep781 genes showed similarity to or had motifs found in genes involved with DNA metabolism. Three of these genes are located within the two top-strand transcriptional blocks; no strong prediction of any morphogenesis genes is found in these blocks (but see "Lysis" below), suggesting that they are early transcriptional units and that the low GC regions have early, rightward promoters.
Bcep781 gene 22 encodes a homologue of RusA, a Holliday junction resolvase found in coliphage 82 and other lambdoid phages (37). Holliday junction resolvases are endonucleases that process the intermediate structure formed during homologous recombination events. The analogous but unrelated T4 gene product is gp49, which is responsible for cleaving branches prior to DNA packaging (13) and is functionally part of the packaging machinery (19). Bcep781 gene 58 encodes a homologue of the phage T4 DNA helicase, UvsW (Dar protein) (6). Like UvsW, Bcep781 gp58 contains imperfect ATP/GTP binding site and DEAH box ATP-dependent helicase signature motifs. Bcep781 gp62 shows significant homology to Bacillus subtilis phage SPO2 DNA-directed DNA polymerase gpL (44), and similar genes are found in phage APSE-1 (gene 45) and in numerous putative prophages (51). These phage and prophage DNA Pol I homologues are only weakly similar to authentic bacterial DNA Pol I subunits, primarily limited to the region around the DNA polymerase A signature domain.
The final two DNA metabolism gene assignments are more problematic. In the first transcriptional block, Bcep781 gene 10 encodes a weak homologue of Caulobacter crescentus and Agrobacterium tumefaciens cell cycle-regulated DNA adenine methyltransferase, CcrM, which is involved in methylation of DNA to effect recruitment of the replication complex (28). Significant homologues of CcrM are found in numerous bacteria and in archaea. There are distant homologues in phage genomes, including the mox gene of the myxococcal phage Mx8, which has been shown to encode DNA adenine methyltransferase activity but which had no nonsense phenotype in either lytic or lysogenic growth (36), leaving the role of this gene in phage DNA metabolism unknown. Although the Bcep1 gene 10 is also a homologue of CcrM, it is located in a cluster of three genes with no DNA sequence similarity with Bcep781 and is thus likely to have been acquired laterally.
As noted above, Bcep1 gene 16 encodes a highly truncated homologue, only 42 residues, of the bacterial DNA polymerase III β-clamp subunit. The Bcep781 and Bcep43 homologues are longer, at 193 residues, but still significantly smaller than the typical length (>300 residues) of authentic bacterial homologues. β-Clamp subunit homologues are not typically encoded by non-T4-like phages. In this phage group, Bcep781 gp15 appears to be a moron as it is located on the opposite strand within the head assembly gene cluster (Fig. 4). Pseudogenes are usually not detected in phage (31). Gene 43 (Bcep1 44, Bcep43 42), immediately downstream of the putative tape measure protein gene, is different in all three phages. Its size (60 to 88 residues) and the fact that it maintains its upstream gap with respect to the tape measure gene (67 bp in Bcep1 and Bcep43 and 68 bp in Bcep781) and the overlap of its stop codon with the downstream gene suggest that the three variants arose by different deletion events from the same original cistron.
(ii) Morphogenesis. Even though individual similarities are low, Bcep781 gene 31 to gene 51 are likely to encode proteins involved in tail, baseplate, and tail fiber assembly (Table 1). Out of 20 predicted genes in this region, only 6 encode proteins with homologues outside of the Bcep781 group or the related prophage of Photorhabdus. However, these show weak or indirect similarity to tail and tail fiber structural proteins. The amino-terminal third of Bcep781 gp31 and Bcep1 gp33 are related both to each other and to the amino-terminal domain of a Shigella flexneri prophage tail fiber protein. In turn, the prophage gene is related to the P2 gpH tail fiber homologue over the C-terminal part of the tail fiber protein. Moreover, the C-terminal domain of the Bcep1 protein shows significant homology with the phage GMSE-1 tail fiber, also a P2 gpH homologue. In contrast, the carboxy terminus of Bcep781 (and Bcep43) gp31 exhibits a sequence relationship to bacterial S-layer proteins and vertebrate mucins. PSI-BLAST results suggest that Bcep781 gp33 is related to Mu gp47, a probable homologue of P2 W, the baseplate wedge (23). PSI-BLAST analysis also suggests that Bcep781 gp36 is related to P2 gpV and Mu gp45, a baseplate assembly protein. Based on indirect homology and length, Bcep781 gp44 is a candidate for the tape measure protein. Bcep781 gp44 shows weak similarity to a Nocardia cryptic prophage protein, which in turn is related to the P2 T tape measure protein (10). While these homologies are weak, additional compelling evidence for the annotation of Bcep781 genes 31, 33, 36, and 44 as functional homologues of P2 H, W, V, and T includes the similar gene order and sizes of the tail fiber and baseplate encoding genes of P2, Mu, and the Bcep781-like myophages. Our assignment of Bcep781 gene 44 as the tape measure protein gene suggests that the gene immediately preceding it would be predicted to encode the frameshift proteins, EE', involved in tail assembly (10).
A recent study identified such frameshift proteins in 49 out of 68 phages and prophages but, in the absence of an obvious candidate for the tape measure protein, did not detect a potential G/GT (the lambda equivalent of P2 EE') equivalent in Bcep781 (58). A manual search of the coding region of Bcep781 gene 45 identified the "slippery sequence" GGCAAAC, which serves as the −1 frameshift motif that generates the alternate C terminus in the G/GT genes of Yersinia lambda (58). However, although this motif is conserved in Bcep43, it has a single base pair change in Bcep1, to GGCGAAC (the fourth base is changed from A to G), which should prevent the frameshift step. Since the frameshift is essential for tail morphogenesis, either this sequence is not the frameshift-inducing element in Bcep781/43 or Bcep1 has an alternative mechanism.
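The manual motif search described here can be approximated with a short script. The sketch below is an assumption-laden illustration, not the search actually performed: it reports every exact occurrence of a candidate slippery heptamer in a nucleotide sequence and lists the positions at which two equal-length variants differ. Only the two heptamers GGCAAAC and GGCGAAC come from the text; the function names and the short test sequence are hypothetical.

```python
def find_motif(sequence, motif="GGCAAAC"):
    """Return 0-based start positions of exact matches to `motif` in `sequence`."""
    sequence = sequence.upper()
    motif = motif.upper()
    width = len(motif)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == motif]


def mismatches(a, b):
    """Return (0-based position, base in a, base in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


# Hypothetical fragment containing the Bcep781/Bcep43 heptamer.
hits = find_motif("ATGGGCAAACTT")
# Compare the Bcep781/Bcep43 heptamer with the Bcep1 variant.
diffs = mismatches("GGCAAAC", "GGCGAAC")
```

An exact-match scan of this kind finds candidate sites only; whether a given heptamer actually promotes a −1 frameshift depends on reading frame and sequence context, which this sketch does not evaluate.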
The most promoter-proximal gene in the tail morphogenesis transcription unit to which a function could be assigned is gp49, a tail spike protein. Particles formed by Bcep781 phage possessing point mutations in gp49 lack tail spikes in electron microscopy images (M. D. King, unpublished data). Bcep781 gp49 and Bcep43 gp48 are 99% identical but are only 27% identical to Bcep1 gp51. Moreover, more than a third of the amino acid residues in these proteins are glycine, serine, or threonine. The compositional bias of these proteins probably accounts for the significant homology to numerous bacterial and metazoan extracellular proteins, including mucins and S-layer proteins.
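The compositional bias mentioned above (more than a third of residues being glycine, serine, or threonine) is straightforward to quantify. The following sketch computes the combined fraction of a chosen residue set in a protein sequence given in one-letter code. The function name and the toy sequence are hypothetical, and the actual gp49 sequence is not reproduced here.

```python
def residue_fraction(protein, residues="GST"):
    """Fraction of positions in `protein` occupied by any residue in `residues`."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    wanted = set(residues.upper())
    return sum(1 for aa in protein if aa in wanted) / len(protein)


# Toy sequence for illustration only: 6 of 12 residues are G, S, or T.
toy_fraction = residue_fraction("MGSTAGLSKTEV")
```

A protein with a fraction above roughly one-third would meet the description given in the text; low-complexity filtering in database searches is the usual way to keep such bias from inflating similarity scores.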
Capsid assembly and the terminase large subunit genes are found in the first leftward transcription unit, genes 19 to 13. Bcep781 gene 18 encodes a homologue of the phage terminase large subunit, TerL, from phages TP901-1 and Aaφ23. Like Bcep781, these phages use headful packaging mechanisms (4). Submolar fragments were detected in digests of TP901-1 and the TP901-1 pac site mapped upstream of terS (56). When TerL subunits from 114 phages were aligned, the clustering corresponded well to the structures of the ends of the packaged DNA (8). Thus, TerL subunits generating 5'-extended cohesive ends and 3'-extended cohesive ends fell into distinct groups. The TP901-1 and Aaφ23 TerL subunits, to which Bcep781 gp18 is related, formed a separate, poorly supported, and deeply branching group distinct from other TerL homologues (8). TerL typically provides the ATPase and DNA cleavage activity for the DNA packaging pathway, while the cognate terminase small subunit, TerS, is responsible for sequence recognition. No TerS homologue was identified in Bcep781, but TerS proteins are typically less conserved than TerL. Bcep781 gene 19 is likely to encode TerS because it is immediately upstream of the TerL gene and the size of its predicted product, ~18 kDa, is typical of TerS homologues. However, gp19 had no significant similarity to proteins in the database outside of the Bcep781 group.
Several lines of evidence indicate that the next two genes in this cassette, Bcep781 gp16 and gp17, encode minor head proteins. Bcep781 gp16 is a member of a cluster of orthologues (39), COG2369, which also contains the minor Mu head protein, gp30, and its homologues. Moreover, gp16 and gp17 are homologues of phage Aaφ23 proteins p32 and p31, respectively (45). Aaφ23 p32 possesses an amino-terminal conserved domain (pfam04233.6) which includes SPP1 G7P. SPP1 G7P is the best-studied protein among members possessing this domain and has been shown to be present in low copy number in SPP1 phage heads (1). Homologues of Bcep781 gp16 and gp17 appear to segregate together, as out of 20 database homologues of gp16 with E values of <0.5, 13 are encoded by genes immediately adjacent to homologues of gp17 (identified with E values of <0.5) (Table S1 in the supplemental material). No other gene pair in Bcep781 shows such tight linkage of homologues. This suggests a functional interaction between the two proteins. The first step in double-stranded DNA head morphogenesis is assembly of a scaffold for the capsid protein subunits. The capsid and/or scaffolding proteins are then frequently processed by a prohead protease. Using a combination of computational and manual strategies, Cheng et al. identified Bcep1 gp15 (the Bcep781 gp14 orthologue) as one of 199 head maturation proteases in phage, prophage, and herpes viruses (9). Bcep1 gp15 was one of 17 sequences, of which 16 were from prophages and only 1 was from a phage, Aaφ23, forming the orthologue cluster, COG3566, which possesses conserved catalytic His and Ser residues and predicted secondary structure, despite great overall primary sequence distance. This annotation is particularly compelling in view of the location of this gene, immediately preceding the gene encoding the Dec (decoration or head stability protein [18]; see below). In many phages, there is a conserved order of genes encoding essential capsid morphogenesis domains: capsid protease, scaffold, head stability (Dec), and major capsid proteins (9). In lambda, the protease and scaffold reading frames are fused, so that the gene for the scaffold Nu3 is constituted by a secondary downstream start codon in the protease gene C (25, 55). Accordingly, inspection of Bcep781 gene 14 revealed a consensus Shine-Dalgarno sequence (GGAGA) positioned 12 bases upstream of the AUG at codon 199, which could thus serve as the site of initiation for the scaffolding protein gp14'.
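The inspection for an internal start site served by a Shine-Dalgarno sequence can be sketched as a scan for in-frame ATG codons with a GGAGA motif a short distance upstream. This is a minimal illustration under stated assumptions, not the annotation method used for gene 14: the spacing window (4 to 14 bases between the end of the motif and the A of the start codon), the function name, and the test sequence are all hypothetical choices, and the spacing convention here may differ from the way "12 bases upstream" is counted in the text.

```python
def internal_starts_with_sd(cds, sd="GGAGA", min_gap=4, max_gap=14):
    """Find in-frame internal ATG codons preceded by a Shine-Dalgarno motif.

    Returns (codon_number, gap) tuples, where codon_number is 1-based and gap is
    the number of bases between the end of the motif and the A of the ATG.
    """
    cds = cds.upper().replace("U", "T")
    hits = []
    # Step through codons in frame, skipping the annotated start codon at position 0.
    for pos in range(3, len(cds) - 2, 3):
        if cds[pos:pos + 3] != "ATG":
            continue
        for gap in range(min_gap, max_gap + 1):
            start = pos - gap - len(sd)
            if start >= 0 and cds[start:start + len(sd)] == sd:
                hits.append((pos // 3 + 1, gap))
    return hits


# Hypothetical coding sequence: GGAGA ends 7 bases before an in-frame ATG at codon 7.
example = internal_starts_with_sd("ATGGCCGGAGACCCCCCCATGAAATAA")
```

Because the motif lies inside the upstream reading frame, a scan like this only nominates candidates; confirmation would need experimental evidence such as N-terminal sequencing of the shorter product.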
The two predominant bands of 33 kDa and 17 kDa observed when purified Bcep781 phage particles were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (not shown) were subjected to N-terminal sequencing. The larger species had an N-terminal sequence of AADLS, corresponding to residues 35 to 39 of gp12, indicating that gp12 is proteolytically processed after residue 34. Moreover, although there was no significant homology with any phage major capsid protein, the predicted secondary structure of gp12, determined by use of the PredictProtein software suite, is very similar to that of a canonical major capsid protein, lambda E (not shown). This is consistent with the notion that the basic folds of major capsid proteins of phages and viruses with icosahedral-symmetry capsids are similar, despite the absence of sequence homology (14). In addition, gene 12 is perfectly positioned downstream of a strong Shine-Dalgarno sequence and is flanked by two very strong GC-rich stem-loop sequences; similar structures flank the lambda E gene and presumably facilitate the efficient translation required to produce high levels of the major capsid protein, relative to other cistrons on the late mRNA. The 17-kDa protein gave an N-terminal sequence of PFQKQVY, corresponding to residues 2 to 8 of gp13, which had a predicted molecular mass of 17.1 kDa. Given that the two proteins appear to be present in equimolar ratios (data not shown), it is possible that they represent the major capsid protein and the Dec protein, respectively. It has been found that with closely related pairs of phages, such as L and P22 or T2 and T4, one member may have a decorator protein while the other does not, suggesting an accessory role under certain conditions (18). 
Thus, despite the extremely limited primary sequence homology, the identified Bcep781-like phage DNA packaging and head assembly cassette gene order is as follows: terS (highly putative, implied by position only), terL, a minor head protein (possibly the portal gene) homologue, a Mu gene 30 homologue, the prohead protease gene with embedded scaffolding protein gene, decorator protein, and the major capsid protein.
(iii) Lysis. Bcep781 gene 27 encodes a homologue of Pseudomonas aeruginosa phage phiCTX gp12, which was annotated as a possible lytic endolysin based on the presence of a peptidoglycan-binding motif (42). Bcep781 gp27 lacks a recognizable peptidoglycan-binding domain and has no primary sequence homology to known endolysins. To test for cell wall-degrading activity, gene 27 was cloned into an inducible expression vector. Cells expressing gene 27 rapidly lysed after membrane permeabilization with chloroform, while cells carrying the vector did not (Fig. 6A), demonstrating that gp27 is the authentic endolysin (60).
Two additional genes present in the lysis cassette of gram-negative hosts are the nested genes Rz and Rz1, originally identified for phage λ as required for host lysis in the presence of cation concentrations that stabilize the outer membrane (61, 62). Rz encodes a secretory protein with a signal peptidase I leader peptide and has been proposed to encode an endopeptidase activity. The Rz1 gene is embedded in the +1 reading frame compared to Rz and encodes a short Pro-rich lipoprotein that has been localized to the outer membrane (29). A manual search of all Bcep781 genes predicted to contain a signal peptide revealed that gene 24 not only encodes a secretory protein of approximately the same size as λ Rz but also has an embedded reading frame, designated gene 25, that is served by a strong Shine-Dalgarno sequence (GGAGA) and encodes a predicted lipoprotein (Fig. 6B). Despite the lack of sequence similarity with the equivalent genes in other phages of gram-negative bacteria, we conclude that genes 24 and 25 are the Rz and Rz1 homologues in Bcep781.
The Bcep781 phage group lysis genes have an atypical organization. Unlike typical lysis cassettes as exemplified with lambdoid phages, in Bcep781 the Rz and Rz1 genes precede the endolysin gene and are separated from it by a gene of unknown function. Moreover, the holin gene is encoded on the opposite strand at the end with the tail assembly cassette (Fig. 6B).
IST analysis.
Overall, a minimal number of Bcep781 group genes could be assigned a function based solely on primary structure homology. To reinforce the identification of genes, an IST library was constructed. These libraries are based on the ability of expressed fusion proteins consisting of the N-terminal DNA-binding domain of the λ CI repressor and sequences encoded by randomly cloned fragments from the target genome to reconstitute repressor function (40). As CI requires separate DNA-binding and dimerization domains, immunity is conferred when the fragment of target DNA is in frame with CI and encodes a stretch of amino acid residues capable of homotypic interactions. A total of 77 immunity-conferring clones were isolated and sequenced. These were found to be in frame with 10 annotated Bcep781 genes (Tables 1 and 2). gp31, the putative P2 H homologue, had the most representatives in the IST library (26 representatives, corresponding to three domains). A similar result was found with the BcepMu IST library, where the P2 H homologue, BcepMu52, was the most abundant IST isolated (49). Hypothetical novel proteins Bcep781 gp28 and gp46 were also represented multiple times in the library. Interestingly, Bcep781 hypothetical novel protein gp19, which is similar in size and genome location but not sequence homology to TerS subunits, was found in the IST library, just as BcepMu TerS was found in the BcepMu IST library. The λ TerS equivalent, Nu1, has been shown to form stable dimers (38). When the IST sequences were compared to the Prosite database (17), the most significant homology to known multimerization domains found was present in IST gp26, which exhibited 92% similarity with a myc-type, "helix-loop-helix" dimerization domain structure. The remaining ISTs did not exhibit similarity of over 75% to known motifs (data not shown).
Despite the high degree of identity, the complement of encoded proteins was not the same for the three phages. In some cases, there appear to be mosaic orthologues, as in the case of Bcep1 gp10, which is closely related at a protein level but not a DNA level to Bcep781 and Bcep43 gp10. At least nine genes between Bcep781 and Bcep1 and four genes between Bcep781 and Bcep43 appear to be the result of either extensive deletions or insertions or the acquisition of a distant yet still related homologue (Tables S2 and S3 in the supplemental material). Both Bcep781 and Bcep1 encode proteins not present in the other two phages. Five of the 71 predicted proteins of Bcep1 are unique, and 3 of these (gp12, gp63, and gp71) have identifiable database homologues (Table 1).
Relationship to BcepB1A and a P. luminescens prophage element. At a protein level, two phages that share a significant number of genes with the Bcep781-like phages were identified. One is a prophage element present in the P. luminescens TT01 genomic sequence, consisting of 19 out of 41 genes (from plu3381 to plu3422) that are largely syntenic (albeit circularly permuted) with Bcep781 (Table 1 and Fig. 5) (11). These include the Dec and major capsid proteins, which have no other homologues in the database, and part of the tail and tail fiber cassette. Otherwise, this prophage element encodes lambdoid lysis proteins, terminase subunits, and tail fiber homologues. Thus, it appears to be a mosaic consisting of the structural genes of a Bcep781-like myophage and a temperate lambdoid siphophage. The Bcep781-related prophage element is about 30 kb and is immediately adjacent to another prophage, PhotoMu (a Mu-like prophage closely related to Burkholderia phage BcepMu), that extends from gi36786729 to gi36786769 (49).
The other phage related to the Bcep781-like phages at a predicted protein level is BcepB1A. Out of 72 predicted BcepB1A-encoded proteins, 14 show significant similarity to Bcep1 proteins and 12 to proteins encoded by the P. luminescens prophage described above (Table 1 and Fig. 5). These include numerous genes that lack appreciable homologues in the database. Overall, BcepB1A has a quite distinct genome arrangement from that of Bcep781-like phages. Although it is also circularly permuted and has one major divergent promoter region, BcepB1A has most of its genes on one strand. The BcepB1A endolysin is a "true lysozyme," a homologue of the well-studied coliphage T4 lysozyme, but, like the endolysins R21 from lambdoid phage 21 and Lyz from coliphage P1, also has the additional feature of an N-terminal SAR domain, shown to direct holin-independent protein secretion (59). Despite disparate hosts, the Photorhabdus prophage is more similar to Bcep781 than Bcep781 is to BcepB1A (Fig. 5). BcepB1A is also mosaic to a lesser extent with other Burkholderia phages Bcep22 and BcepC6B (GenBank accession numbers NC_005262 and NC_005887). Although isolated using the same enrichment technique, these are quite distinct from the Bcep781-like phages. Bcep22 and BcepC6B are podophages with genome sizes of 64 kb and 42 kb, respectively (unpublished). BcepC6B is a temperate phage, mosaic to Bordetella podophage BPP-1 (35).
| DISCUSSION |
|---|
Because it is so widespread, lateral gene transfer is obviously advantageous to phage. What then is the contribution of divergence due to positive selection for random mutations relative to this mosaicism? The high degree of DNA sequence identity exhibited by Bcep781, Bcep43, and Bcep1 (87% to 99%) made it possible to generate values for synonymous and nonsynonymous nucleotide substitutions for the majority of the genes. These data can be interpreted in terms of the selective pressures on the phages. If it is assumed that 25% of random substitutions result in synonymous changes and 75% in nonsynonymous substitutions (32), then the number of synonymous substitutions per potential synonymous substitution site (Ks) and the number of nonsynonymous substitutions per nonsynonymous site (Kns) can be estimated. For Bcep781 compared to Bcep43, the values over all nonmosaic protein coding genes are Ks = 2.89 and Kns = 0.3. Thus, the Kns/Ks ratio is 0.103. For Bcep781 compared to Bcep1, the values for Ks and Kns are 2.62 and 0.3, respectively, making the Kns/Ks ratio 0.113. When Kns is less than Ks (Kns/Ks < 1), the simplest interpretation is that the selection pressure is purifying, i.e., natural selection is acting to decrease the frequency of deleterious alleles (32). These results were remarkably uniform across all genes determined to be nonmosaic (Tables S2 and S3 in the supplemental material). Because of the high sequence identity between Bcep43 and Bcep781, when values are assessed on an individual gene basis, only 22 of the 59 aligned genes possessed enough nucleotide differences to perform the analysis (Table S2 in the supplemental material). Of these genes, only one, gene 11, showed Kns/Ks to be >1. As Bcep781 and Bcep1 are more distant, more genes could be analyzed on an individual basis when Bcep781 was compared to Bcep1. Out of 52 aligned coding regions, 44 had enough nucleotide differences to perform the calculation (Table S3 in the supplemental material). Again, only one, Bcep781 gene 28 (compared to Bcep1 gene 29), showed Kns/Ks to be >1. The simplest interpretation of this observation is that for 32,150 bases of aligned Bcep1 and Bcep781 DNA sequence (corresponding to 66% of the genome), the overwhelming majority of nucleotide differences observed reflect an evolutionary path of purifying selection against, rather than positive selection for, amino acid changes.
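The Kns/Ks arithmetic described above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. The function names are hypothetical, and the way the 25%/75% split is turned into per-site rates is only one plausible reading of the text: potential synonymous sites are taken as 25% of the aligned bases and potential nonsynonymous sites as 75%. The difference counts in the first example are toy numbers, not data from this study. The second example uses the rounded Ks and Kns values quoted in the text, taking Ks as the larger value in each comparison (consistent with the reported ratios below 1), so the computed ratios may differ slightly in the third decimal place from the published ones.

```python
# Sketch only: one plausible reading of the 25%/75% assumption in the text.
SYN_FRACTION = 0.25     # assumed share of random substitutions that are synonymous
NONSYN_FRACTION = 0.75  # assumed share that are nonsynonymous


def estimate_ks_kns(syn_diffs, nonsyn_diffs, aligned_bases):
    """Return (Ks, Kns): observed differences per potential site of each class."""
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    ks = syn_diffs / (SYN_FRACTION * aligned_bases)
    kns = nonsyn_diffs / (NONSYN_FRACTION * aligned_bases)
    return ks, kns


def kns_ks_ratio(kns, ks):
    """Return Kns/Ks; undefined when there are no synonymous differences."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return kns / ks


def interpret(ratio):
    """Simplest interpretation of the ratio, as stated in the text."""
    if ratio < 1:
        return "purifying selection"
    if ratio > 1:
        return "positive selection"
    return "no net selection detected"


# Example 1: hypothetical toy counts for a single 1,200-base gene (not real data).
toy_ks, toy_kns = estimate_ks_kns(syn_diffs=30, nonsyn_diffs=9, aligned_bases=1200)
print("toy gene:", toy_ks, toy_kns, interpret(kns_ks_ratio(toy_kns, toy_ks)))

# Example 2: rounded whole-genome values quoted in the text, as (Kns, Ks).
reported = {
    "Bcep781 vs Bcep43": (0.3, 2.89),
    "Bcep781 vs Bcep1": (0.3, 2.62),
}
for pair, (kns, ks) in reported.items():
    ratio = kns_ks_ratio(kns, ks)
    print(f"{pair}: Kns/Ks = {ratio:.3f} ({interpret(ratio)})")
```

The guard against Ks of zero reflects the point made in the text that genes with too few nucleotide differences could not be analyzed individually.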
A similar bias towards neutral genetic drift was found with a comparison of lambdoid phages Sf6 and HK620 (7). Sf6 and HK620 exhibit greater than 83% nucleotide identity over 42.9% of their genomic sequence. These regions were distributed across 20 homology regions encoding 24 proteins. Of these proteins, only one that possessed more nonsynonymous substitutions per nonsynonymous site than synonymous substitutions per synonymous site was identified.
Despite this high identity, the phages exhibit some mosaicism in relation to one another. Similarly to what was observed with Sf6 and HK620, mosaicism was not limited to the acquisition of completely unrelated sequences but also applied to the acquisition of close homologues of the same gene (7). This type of mosaicism is not obvious at a protein sequence level, and thus the degree of mosaicism among phages is probably underestimated. An example of this is Bcep1 gp10, which is 40% identical at an amino acid level to Bcep781 gp10, despite there being no DNA sequence homology. Among the Bcep781 group of phages, therefore, it appears that mosaicism is a dominant mechanism for protein sequence level changes. Given the immensity and diversity of the phage population, it is likely that optimized genes are already available for most conditions. One interpretation of these data is that for proteins under selection to change, adaptive mosaicism is more successful than selection for adaptive divergence among phages.
| ACKNOWLEDGMENTS |
|---|
The assistance of Jim Hu and Leonardo Mariño-Ramirez in generating the IST library was essential. We thank Chris Upton, Rachel Roper, and Vasily Tcherepanov for access and help with the SARS Bioinformatics Suite programs. We are grateful for sequencing and robotics facilities provided to this program through the cooperation of John Mullet and Eun G. No (Center for Plant Genomics and Biotechnology) and Robert Klein (Southern Plains Agricultural Research Center, USDA-ARS). Electron microscope imaging was done by Cristos Savva at the Microscopy and Imaging Center of the Department of Biology at Texas A&M University.
| FOOTNOTES |
|---|
Supplemental material for this article may be found at http://jb.asm.org/.
Present address: Department of Plant Pathology, University of Nebraska, Lincoln, NE 68583.
| REFERENCES |
|---|
23 of Actinobacillus actinomycetemcomitans. J. Bacteriol. 186:5523-5528.