Journal of Bacteriology, February 2005, p. 1055-1066, Vol. 187, No. 3
0021-9193/05/$08.00+0 doi:10.1128/JB.187.3.1055-1066.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
X. Manival,1 C. Desplats,1 and H. M. Krisch1*
Laboratoire de Microbiologie et Génétique Moléculaire du CNRS, UMR 5100, Toulouse, France1
Received 7 July 2004/ Accepted 18 October 2004
| ABSTRACT |
|---|
50-amino-acid N-terminal domain is the only highly conserved segment of the protein. This sequence conservation is probably a direct consequence of the domain's strong and specific interactions with the neck proteins. The sequence of the central fibrous region of gpwac is highly plastic, with only the heptad periodicity of the coiled-coil structure being conserved. In the various gpwac sequences, the small C-terminal domain essential for initiation of the folding of T4 gpwac is replaced by unrelated sequences of unknown origin. When a distant T4-type phage has a novel C-terminal gpwac sequence, the phage's gp36 sequence that is located at the knee joint of the LTF invariably has a novel domain in its C terminus as well. The covariance of these two sequences is compatible with genetic data suggesting that the C termini of gpwac and gp36 engage in a protein-protein interaction that controls phage infectivity. These results add to the limited evidence for domain swapping in the evolution of phage structural proteins.
| INTRODUCTION |
|---|
α-helical structure that is flanked by small N- and C-terminal globular domains (14, 28, 33). It is the N-terminal 50-aa domain of T4 gpwac that binds to the neck of the phage, whereas the 30-aa C-terminal domain of the protein (the "foldon") is essential for the initiation of trimerization and the correct folding of the polypeptide chains in the fiber (14, 23). The nucleotide sequences of the wac genes from a few T-even phages very closely related to T4 have been determined previously (M. M. Schneider et al., unpublished data; our unpublished results), but the amino acid sequences of all of these proteins are virtually identical to that of T4 gpwac. Recently, a phylogeny for the T4-type phages has been established that includes a number of much more distant phages (18, 27, 38). Four subgroups of T4-type phages can be distinguished (18, 38): the T-evens (very closely related to T4), the PseudoT-evens (morphologically very similar to T4 but phylogenetically distant from it), and the SchizoT-evens (having limited but clear morphological differences from T4 and phylogenetically quite distant from it). The fourth subgroup, the ExoT-evens, are phages of marine cyanobacteria that have a virion morphology that differs considerably from T4 (isometric heads and longer contractile tail structures); nevertheless, many of the ExoT-even genes are homologs of T4 genes (18). Surprisingly, in view of the modular shuffling model for phage genomes, initial genomic sequencing (12, 18; unpublished results) revealed a homogeneous level of protein sequence divergence among the various "core" genes within the members of the T4 phage family. Thus, a phylogeny of the T4-type phages based on the major capsid protein reflects reasonably well the relatedness of almost all of their structural proteins and many of the nonstructural proteins. We have now determined the wac gene sequence from both PseudoT-even and SchizoT-even phages.
The N-terminal globular domain that interacts specifically with the neck structure is largely conserved while the central domain diverges considerably in amino acid sequence, retaining only the repeated heptad motifs of the fibrous structure. In contrast to our expectations, however, the C-terminal domain, which is essential for efficient initiation of T4 gpwac folding, was frequently subjected to modular replacement. All of the distant wac gene sequences had different and unrelated C-terminal domains. The possible role of the modular swapping of this domain is discussed here.
| MATERIALS AND METHODS |
|---|
PCRs, oligonucleotide primers, and sequencing. PCR amplification of the DNAs was performed with the Long Expand kit (Roche, Penzberg, Germany) according to the instructions provided by the manufacturer. When inosine-containing primers were used, the PCR amplifications were done with HotTub polymerase (Amersham). For degenerate primers, a 50°C hybridization temperature was used. For specific primers the hybridization temperatures were chosen to match the lowest Tm of the two primers used.
Sequencing was done on the Beckman Seq2000XL automated sequencer by using Beckman DTCS standard chemistry kits according to the supplier's recommendations.
The primers used to amplify the fragments between gene 10 and the previously determined sequence (accession number Z78092) upstream of gene 13 in bacteriophage RB49 were as follows: G10dir (5'-TCACAATTCGCTGGTTTAATAATGAT-3') and 13Wrev (5'-TTATACCGGAGAAATCCCAAAGAGG-3'). The "consensus" inosine-containing primers designed for amplification of gene 10 to gene 13 fragments from various PseudoT-even and SchizoT-even phages were AL10d (5'-CCTICACAIACCGCAGGTGG-3') and AL13r (5'-AGAAAGTCIGTIAACCACGGATA-3'). With these primers, fragments of 4.5 kb were obtained for phages RB42 and RB43, as well as for T4 and RB49, which were used as controls. For some other distant T4-type phages, no amplification was obtained. Thus, to obtain the sequences in the region from gene 10 to gene 13 of phage 42, we had to use a more complicated approach. We had previously obtained a number of sequences of random PCR fragments of the genome of phage 42 (details are available from the authors). We selected sequences flanking the region from gene 11 to gene 13 and used these to synthesize the primers 42-11d (5'-TTTACTAGTGTCGAAGCTCC-3') and 42-13R (5'-TCTAGGGACCGCTGTATA-3').
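The rule stated above for specific primers (use the lowest Tm of the pair as the hybridization temperature) can be illustrated with a short sketch. The article does not say how Tm was calculated; the Wallace rule used here, 2 °C per A/T and 4 °C per G/C, is an assumption chosen for simplicity and is only a rough estimate for primers of this length. The function names are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: the article does not state which Tm formula was used.
    """
    seq = primer.upper()
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    if at + gc != len(seq):
        raise ValueError("only unambiguous A/C/G/T primers are supported")
    return 2 * at + 4 * gc


def annealing_temperature(forward: str, reverse: str) -> int:
    """Apply the rule from the text: take the lower Tm of the two primers."""
    return min(wallace_tm(forward), wallace_tm(reverse))


# Specific primer pair for phage RB49, copied from the text.
G10DIR = "TCACAATTCGCTGGTTTAATAATGAT"
W13REV = "TTATACCGGAGAAATCCCAAAGAGG"

if __name__ == "__main__":
    for name, seq in (("G10dir", G10DIR), ("13Wrev", W13REV)):
        print(name, len(seq), "nt, Wallace Tm about", wallace_tm(seq), "deg C")
    print("hybridization temperature (lower Tm):",
          annealing_temperature(G10DIR, W13REV))
```

Inosine-containing and degenerate primers are deliberately rejected by this sketch, since the text uses a fixed 50°C hybridization temperature for degenerate primers.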
The "universal" primers used in the attempt to amplify the same region in PseudoT-evens and SchizoT-evens, as well as several different T-even phages (see Results), were Tev-10d (5'-TTTGTCG(T/C)TGGATAAGG(A/G)T(A/T)G-3') and Tev-13r (5'-CTAGGGCACGCTGNATACAATC(G/A)TA-3').
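The "universal" primers are written with alternatives in parentheses, such as (T/C), and with N for any base. As a minimal sketch, assuming only that notation, the following code expands such a primer into the individual oligonucleotides it represents and counts them. The parser and function names are hypothetical, and inosine (I) is not handled.

```python
import itertools
import re

# One token is either a parenthesised choice such as (T/C) or a single letter.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTN])")


def parse_degenerate(primer: str) -> list[str]:
    """Return, for each primer position, the string of bases allowed there."""
    positions = []
    index = 0
    while index < len(primer):
        match = TOKEN.match(primer, index)
        if match is None:
            raise ValueError(f"unexpected character at offset {index}")
        if match.group(1):
            positions.append(match.group(1).replace("/", ""))
        elif match.group(2) == "N":
            positions.append("ACGT")
        else:
            positions.append(match.group(2))
        index = match.end()
    return positions


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by the degenerate primer."""
    count = 1
    for allowed in parse_degenerate(primer):
        count *= len(allowed)
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence represented by the degenerate primer."""
    return ["".join(combo) for combo in itertools.product(*parse_degenerate(primer))]


# Primer sequences copied from the text.
TEV_10D = "TTTGTCG(T/C)TGGATAAGG(A/G)T(A/T)G"
TEV_13R = "CTAGGGCACGCTGNATACAATC(G/A)TA"

if __name__ == "__main__":
    for name, primer in (("Tev-10d", TEV_10D), ("Tev-13r", TEV_13R)):
        print(name, "degeneracy:", degeneracy(primer))
```

Each of the two primers has three or two variable positions, so the pool of distinct oligonucleotides stays small.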
To amplify the fragment from phage Aeh1 from gene 10 to gene 18, we used the primers A10d (5'-AATTCGATCCAGACGATCAG-3') and A18R (5'-GATATAGAGCAGAGGAGTTAC-3'). These primers were based on Aeh1 gene 10 and gene 18 sequences obtained by a strategy (see below) involving sequencing of random genome fragments. PCR amplification of the Aeh1 genome with these primers yields a fragment of >14 kb, so we digested this large fragment with HindIII and ligated the digest to pBluescript SK(+) vector DNA digested with the same enzyme. We then performed a PCR with this ligation as the template and A10d and the standard M13 reverse primer as the primers. We obtained an 8-kb fragment that was sequenced from the M13 reverse primer. From the sequence obtained, we designed the new primer AH3R (5'-TAAAGGTGTCCGTCTGGATA-3'), which was used together with the A10d primer to generate an 8-kb fragment from gene 10 to the point of hybridization of the AH3R primer. This fragment was sequenced by primer walking, and one of the primers generated for this purpose, Awd6 (5'-CTCAACGGTAAAGTTATGGAT-3'), was used with the A18R primer to obtain a 7-kb fragment that was partially sequenced by primer walking from the Awd6 primer to the beginning of the N terminus of the gene 13 homolog.
The primers used to amplify the RB49 wac sequence for cloning in the expression vector were Wstart-NcoI (5'-GGGTGCGCCCATGGTTGAACAATTAAACATCC-3') and WBamHI-rev (5'-TCTTGGGATCCTGGTTACTATCCTGCA-3'). These primers carry NcoI (CCATGG) and BamHI (GGATCC) restriction sites, respectively. For the C-truncated deletion variant of RB49 wac (RBW-N), we used the primer combination Wstart-NcoI and F-stop-W, which introduces a BamHI site (5'-CTCAAGGGATCCTCAAATATCTGCGGTATTTTTAAATGC-3'). To create the N terminally truncated deletion variant of the same gene (RBW-CE), we used the primer WBamHI-rev and the primer WEdir, which introduces an NcoI site (5'-CTCTAGCCATGGGTATTAAAGTTGTAGAAAACACTC-3'). To clone the phage 42 wac into an expression vector, we used the primers PCW-NcoI (5'-GGGTTCAGCCATGGAAATTCTTCCATTTGTAAATAGC-3') and PCW-BamHI (5'-ACCTCGATGGATCCGAAGGCCCTAA-3').
To amplify, clone, and sequence gene 36 of phage T4, we used the primers T436NcoD (AAGGGGCATACCATGGCTGA) and T436BamR (CATGGATCCTCTTAATAATAGCCGA). To create the amber mutation amS197 in gene 36 of phage T4, we used the primers T436CamD (ATCCTGCATAGCAACCCTCACA) and T436CamR (GAGGGTTGCTATGCAGGATTCT).
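The cloning primers above each carry an NcoI or a BamHI site. The following sketch locates those sites by plain string search, using the standard recognition sequences CCATGG (NcoI) and GGATCC (BamHI). The primer sequences are copied from the text; the function names and the bracket notation used to mark a site are illustrative choices.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "Wstart-NcoI": "GGGTGCGCCCATGGTTGAACAATTAAACATCC",
    "WBamHI-rev": "TCTTGGGATCCTGGTTACTATCCTGCA",
    "F-stop-W": "CTCAAGGGATCCTCAAATATCTGCGGTATTTTTAAATGC",
    "WEdir": "CTCTAGCCATGGGTATTAAAGTTGTAGAAAACACTC",
    "PCW-NcoI": "GGGTTCAGCCATGGAAATTCTTCCATTTGTAAATAGC",
    "PCW-BamHI": "ACCTCGATGGATCCGAAGGCCCTAA",
}


def find_sites(primer: str) -> list[tuple[str, int]]:
    """Return (enzyme, 0-based start) for every recognition site in the primer."""
    hits = []
    for enzyme, site in SITES.items():
        start = primer.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = primer.find(site, start + 1)
    return hits


def mark_sites(primer: str) -> str:
    """Return the primer with each recognition site wrapped in square brackets."""
    marked = primer
    for site in SITES.values():
        marked = marked.replace(site, f"[{site}]")
    return marked


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:12s} {find_sites(seq)} {mark_sites(seq)}")
```

Both recognition sequences are palindromic, so searching the primer strand alone is sufficient.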
Random sequencing procedure. To obtain the sequences flanking the regions of interest in the T4-type genomes, we constructed random genome libraries and sequenced about 30 fragments per genome. To construct a library, samples containing 1 to 5 µg of phage genomic DNA were digested simultaneously with the EcoRV and SspI restriction enzymes or, alternatively, with HaeIII. All of these enzymes produce blunt-ended fragments. The DNA was purified from the restriction mixtures by using a Qiagen PCR purification kit and incubated with 3 U of Taq polymerase per 10 µl of reaction volume in the presence of 0.2 mM dATP. This treatment leads to the template-independent addition of a single A residue at the 3' termini of the fragments. The fragments were then cloned according to the supplier's instructions by using the T-System kit for direct cloning of PCR fragments (Promega). The plasmid insertions obtained were sequenced by using the standard M13 reverse sequencing primer. In some cases, when large insertions were obtained, additional sequencing from the opposite end of the insert was done by using the M13 sequencing primer.
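The library construction relies on blunt-cutting enzymes. A minimal in silico sketch of such a digest is given below, assuming a linear molecule and the standard recognition sites with a cut in the middle of each site (GAT^ATC for EcoRV, AAT^ATT for SspI, GG^CC for HaeIII). The toy sequence is invented for illustration and is not phage sequence.

```python
import re

# Recognition site and offset of the blunt cut within the site.
BLUNT_CUTTERS = {
    "EcoRV": ("GATATC", 3),
    "SspI": ("AATATT", 3),
    "HaeIII": ("GGCC", 2),
}


def cut_positions(seq: str, enzymes: list[str]) -> list[int]:
    """Return sorted cut coordinates (index of the first base after each cut)."""
    cuts = set()
    for name in enzymes:
        site, offset = BLUNT_CUTTERS[name]
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq: str, enzymes: list[str]) -> list[str]:
    """Cut a linear sequence with all listed enzymes and return the fragments."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AAAGATATCCCCAATATTGGG"  # hypothetical sequence, for illustration only
    print(digest(toy, ["EcoRV", "SspI"]))
    print(digest("AAGGCCTT", ["HaeIII"]))
```

Because all three enzymes leave blunt ends, every fragment from such a digest is a substrate for the A-tailing step described above.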
The GenBank accession numbers of the nucleotide sequences were as follows: AY266304, B. cepacia phage 42 fibritin (wac) gene and a part of gene 13; AY266305, enterobacterium phage RB43 fibritin (wac) gene; AY266307, enterobacterium phage RB43 genes 36-37.1, 36-37.2, and 38. The sequences from phages RB49 and Aeh1 are now included in the complete genome sequences of these viruses (AY343333 and AY266303, respectively).
Cloning, protein expression, and protein analysis. The cloning, expression, and protein analysis by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) were done as described by Sambrook et al. (31).
Purification of proteins. The same method was used for purification of the phage RB49 gpwac and of its variants. Competent cells (Novagen) of E. coli BL21(DE3) were transformed with the appropriate chimeric plasmid. Then, 100 ml of Luria-Bertani medium plus 0.1% glucose was inoculated with a transformed colony, and the cells were grown at 37°C with agitation for 3 h. IPTG (isopropyl-β-D-thiogalactopyranoside) was then added to 1 mM, and the incubation was continued for 2.5 h. The cells were then collected by centrifugation at 6,000 rpm for 15 min and resuspended in 5 ml of 50 mM Tris-HCl (pH 7.5). Lysozyme and DNase were added to final concentrations of 300 and 5 µg/ml, respectively. The extract was frozen and thawed three times, incubated at 20°C for 30 min, and then treated with an ultrasonic disintegrator for 1 min. Phenylmethylsulfonyl fluoride was then added to a final concentration of 1 mM, streptomycin sulfate was added to a final concentration of 3% (wt/vol), and the extract was placed on ice for 30 min. The nucleic acids and cell debris were removed by centrifugation at 10,000 rpm for 15 min. (NH4)2SO4 was slowly added to a final concentration of 25% saturation with continual magnetic stirring. After 30 min of further incubation on ice, the precipitated proteins were obtained by a 15-min centrifugation at 10,000 rpm, and the pellet was resuspended in 1 ml of 50 mM Tris-HCl (pH 7.5). The supernatant's concentration of (NH4)2SO4 was then increased to 35% of saturation or until the turbidity increased notably, and the solution was incubated for a further 30 min on ice. After 20 min on ice, the precipitated protein was collected by centrifugation as described above. The supernatant was carefully removed with a Pasteur pipette, and the pellet was redissolved in 1 ml of 50 mM Tris-HCl (pH 7.5).
Further purification was performed, in some cases, by chromatography on a hydroxylapatite column. The column containing 3 to 4 ml of hydroxylapatite equilibrated overnight in 10 mM sodium phosphate buffer (pH 7.6) was loaded with 0.5 to 1 ml of the partially purified protein solution, and the protein was eluted by using the following step gradient of sodium phosphate buffer concentration: 10, 20, 50, 100, and 200 mM. All fractions were then analyzed by SDS-PAGE.
Circular dichroism of wild-type phage RB49 wac protein. Circular dichroic spectra were recorded at 25°C with a Jobin-Yvon VI dichrograph. A cell with a 1-mm optical path length was used to record the spectra of purified recombinant protein preparations at a peptide concentration of 0.4 mg/ml in 10 or 50 mM phosphate buffer (pH 7.5) in the UV region (260 to 190 nm).
Mutagenesis of phage T4 gene 36. To obtain the gene 36 amber mutation amS197, the N-terminal and C-terminal parts of the gene were amplified separately by PCR with primers T4g36CamR and T436NcoD for the N-terminal fragment and primers T4g36CamD and T436BamR for the C-terminal fragment. These PCR fragments were purified and mixed, and an additional 10 cycles of PCR were performed with the primers T436NcoD and T436BamR. The 6.6-kb fragment obtained was treated with Taq DNA polymerase in the presence of dATP and cloned into the pGEM-T vector. Sequencing verified that the amber mutation introduced by the primers T4g36CamR and T4g36CamD was present in this plasmid. This plasmid was used for transformation of the E. coli BE strain. A culture of the plasmid-containing strain was infected with T4D+ phage, and the titer of the lysate was determined on strain CR63, which contains an amber suppressor. Individual plaques were then replicated with toothpicks to lawns of strains CR63 and BE. About 1% of the progeny phages grew well on CR63 but not on BE. Phage stocks were grown from two such plaques, after repeated single plaque isolation. The gene 36 sequence of these phages was PCR amplified and then sequenced to confirm the presence of the mutation, which was named T4g36amC1.
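The two-fragment strategy above works only if the mutagenic primers overlap, so that the N-terminal and C-terminal PCR products can anneal to each other in the final amplification. As a sketch, the following code checks this for the two mutagenic primer sequences listed in the primer section (named T436CamD and T436CamR there). It takes the reverse complement of the reverse primer, finds the region shared with the direct primer, and looks for a TAG triplet in it. The reading frame is not checked, so the presence of TAG is only consistent with an amber codon, not proof of one. Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(left: str, right: str) -> str:
    """Longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return right[:size]
    return ""


# Mutagenic primer sequences copied from the text (5' to 3').
CAM_D = "ATCCTGCATAGCAACCCTCACA"
CAM_R = "GAGGGTTGCTATGCAGGATTCT"

# The top strand of the N-terminal fragment ends with the reverse
# complement of CAM_R; the C-terminal fragment starts with CAM_D.
SHARED = suffix_prefix_overlap(reverse_complement(CAM_R), CAM_D)

if __name__ == "__main__":
    print("reverse complement of CamR:", reverse_complement(CAM_R))
    print("shared region:", SHARED, f"({len(SHARED)} nt)")
    print("TAG triplet present:", "TAG" in SHARED)
```

A shared region of this length is what allows the two purified fragments to prime on each other during the additional PCR cycles.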
Test of the ability of the phage mutants to grow in the presence of PEG. A liquid culture of E. coli BE was grown for 2 to 3 h with vigorous agitation to a cell density of ca. 10⁷, and a sterilized 30% solution of polyethylene glycol (PEG) 6000 was then added to a final concentration of either 5 or 6.4%. The culture was inoculated with phage at a multiplicity of infection of 30 to 50 PFU/cell. The incubation was continued at 37°C with agitation until the lysis of the culture was observed.
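The quantities in this protocol follow from simple dilution arithmetic. The sketch below computes how much 30% PEG stock must be added to reach a given final concentration, and how many PFU are needed for a given multiplicity of infection. The 10-ml culture volume is a hypothetical value, and the cell density is assumed to be expressed per milliliter; neither is stated in the text.

```python
def stock_volume_to_add(culture_ml: float, stock_percent: float,
                        final_percent: float) -> float:
    """Volume of stock (ml) to add so the mixture reaches final_percent.

    Derived from stock * v = final * (culture + v), solved for v.
    """
    if not 0 < final_percent < stock_percent:
        raise ValueError("final concentration must lie between 0 and the stock")
    return culture_ml * final_percent / (stock_percent - final_percent)


def phage_needed(cells_per_ml: float, culture_ml: float, moi: float) -> float:
    """Total PFU required to infect the culture at the given multiplicity."""
    return cells_per_ml * culture_ml * moi


if __name__ == "__main__":
    culture_ml = 10.0  # hypothetical culture volume
    for final in (5.0, 6.4):
        volume = stock_volume_to_add(culture_ml, 30.0, final)
        print(f"{final}% PEG: add {volume:.2f} ml of 30% stock")
    # Assumes a density of 1e7 cells per ml and the lower MOI from the text.
    print("PFU needed at MOI 30:", phage_needed(1e7, culture_ml, 30))
```

Note that adding the PEG stock increases the total volume, which is why the denominator is the difference between the stock and final concentrations rather than the stock concentration alone.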
| RESULTS |
|---|
α-helical coiled-coil domain (Fig. 1). The atomic structure of its N-terminal globular domain, which interacts with the phage neck, has recently been determined (5). The foldon of the C-terminal domain is required for the assembly and folding of the trimeric T4 gpwac fibril. The structure of truncation mutants containing only the last portion of the coiled-coil domain has also been determined at the atomic resolution level by X-ray crystallography (34, 36).
Genomic sequence of the region from g10 to g13 of the T-even, PseudoT-even, and SchizoT-even phages. Using a variety of consensus and unique PCR primers (see Materials and Methods), we isolated and sequenced the genomic segment (g10 to g13) that contains the wac gene of diverse T4-type phages. These included a number of T-even phages, as well as the PseudoT-even phages RB49, 42, RB42, and RB43. Using a genome scanning technique (12), we also obtained the sequence of the gpwac region of the SchizoT-even phage genome Aeh1. In all cases, sequences homologous to the T4 phage genes 11, 12, and wac were found in the segment between the genes 10 and 13.
Conservation of the N-terminal domain of gpwac. The N-terminal domain of the wac gene was conserved in all of the phages we analyzed. For example, the N-terminal domain of phage 42 wac has 48% sequence identity to T4 wac, whereas the downstream coiled-coil domain has just 27% identity (Fig. 2).
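Identity values such as the 48% and 27% quoted above are computed from aligned sequences. The article does not state which convention was used, so the sketch below adopts one common choice as an assumption: identical residues divided by the number of aligned columns in which neither sequence has a gap. The two short sequences are invented for illustration and are not gpwac sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    print(percent_identity("MKT-LV", "MRTALV"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when gaps are present.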
α-helixes in a coiled-coil structure are strained, so that there are only 3.5 aa per helix turn rather than the normal 3.6 aa per turn. As a consequence, the residues a and d in the heptad (indicated in Fig. 2 by yellow shading) form a hydrophobic band along one side of the helix. In the intertwined trimeric helical structure, these residues form a hydrophobic core that stabilizes the fibril (Fig. 1, right panel). In T4 gpwac, there are 13 sequence segments that could form coiled-coil domains. However, this T4 coiled-coil structure is interrupted by looped-out sequences (33) that lack the heptad periodicity; these loops frequently begin with a Gly residue and often contain Pro residues. In T4 the structure and position of only the last two C-terminal loops have been confirmed by X-ray crystallography (36). The amino acid sequences of the coiled-coil domain of the PseudoT-even and SchizoT-even gpwac's diverge substantially from those of T4, although the heptad periodicity is well conserved (Fig. 2). Furthermore, the positions of the interruptions in this periodicity can usually be aligned to the loop positions predicted in the T4 gpwac (33, 36). In the gpwac of phage 42 the T4 loop 11 is deleted (Fig. 2) and in phage Aeh1 there are additional perturbations of the loop-coil pattern.
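The heptad periodicity described above (seven residues spanning two helical turns, hence 3.5 aa per turn, with hydrophobic residues at positions a and d) can be detected with a very simple scan. The sketch below tries all seven possible registers and reports the one with the highest fraction of hydrophobic residues at the a and d positions. This is an illustration of the idea, not the method used in the article; the hydrophobic residue set and the idealized toy peptide are assumptions.

```python
HEPTAD = "abcdefg"
# Assumption: residues counted as hydrophobic for the purpose of this sketch.
HYDROPHOBIC = set("AILMFVWY")


def core_fraction(seq: str, offset: int) -> float:
    """Fraction of a/d positions that are hydrophobic when seq[offset] is 'a'."""
    core = [res for index, res in enumerate(seq.upper())
            if HEPTAD[(index - offset) % 7] in "ad"]
    if not core:
        raise ValueError("sequence too short to contain an a or d position")
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq: str) -> tuple[int, float]:
    """Return (offset, fraction) for the register with the best a/d score."""
    if len(seq) < 7:
        raise ValueError("need at least one full heptad")
    scores = [(core_fraction(seq, offset), offset) for offset in range(7)]
    fraction, offset = max(scores, key=lambda item: item[0])
    return offset, fraction


if __name__ == "__main__":
    toy = "LEELKKK" * 4  # hypothetical idealized heptad repeat
    print(best_register(toy))
    print(best_register("GS" + toy))  # a two-residue prefix shifts the register
```

A loop that interrupts the coiled coil would show up in such a scan as a shift of the best register between the segments on either side of it.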
The divergence of the central fibrous domain of the gpwac protein is consistent with this part of the protein not being involved in strong protein-protein interactions. The only constraint on this sequence appears to be the periodicity of the hydrophobic residues. The presence of different hydrophobic amino acid residues in positions a and d of the heptad can influence the number of α-helixes forming the superhelix, but the C-terminal domain may impose the trimeric fiber organization. Thus, the amino acid interactions within the coiled-coil domain could be less important in determining the oligomeric structure of the fibril than in polypeptide structures that form without a nucleating foldon domain. This could also explain why T4 gpwac has significantly more coiled-coil structure than is predicted by the COILS program (http://www.ch.embnet.org/software/COILS_form.html).
To confirm that the wac fibrils from one of the distant T4-type phages were composed of gpwac trimers, we performed an in vitro refolding experiment with an equimolar mixture of a gpwac extract obtained from RB49 wild type and from a small wac-deletion mutant. The SDS-PAGE analysis of the unheated refolded complexes (see below) indicated that, as in T4, the RB49 fibrils are composed of gpwac trimers (data not shown).
Similarly, we were able to show that these novel wac genes are functional and are not pseudogenes. Antibody was made to the cloned Wac protein from phage Aeh1, and this was shown in a Western blot to interact with a protein of the same size present in purified Aeh1 virions (data not shown). This result demonstrates that the Aeh1 wac gene is functional. It could still be argued that the more bizarre version of wac that we sequenced was not the functional gene copy and that another, functional, copy of wac might exist elsewhere in the phage genome. Recently, a number of T4-type phage genomes have been completely sequenced (H. M. Krisch and J. D. Karam, http://phage.bioc.tulane.edu/index.html), including those analyzed here. No duplication of wac was found in any of these phages.
Modular swapping of the fibritin C terminus in phages phylogenetically distant from T4. Since the 30-aa C-terminal domain of the T4 gpwac is required for the initiation of trimerization and folding (14, 23), this segment was expected to be conserved among all gpwac homologs. The C-terminal amino acid sequences of the wac gene from 13 closely related T-even phages were virtually identical to that of the T4 protein (sequence data not shown). However, all of the phylogenetically distant (<80% amino acid identity) T4-type phages had unrelated C-terminal domains (Fig. 3). In phage RB49, for example, the homology with T4 gpwac ceases after the residue Trp476 (T4 wac sequence coordinates), and the terminal 10 aa are replaced by a 143-aa sequence of unknown origin. In phages RB42 and RB43, at the residue aligned with Gln455 in T4 gpwac, there is a 323-aa insertion with weak homology to proteins containing an immunoglobulin-like fold domain (17). BLAST analysis (data not shown) of this novel 323-aa extension reveals that it actually contains three imperfect tandem repetitions of such an immunoglobulin-like domain.
The SchizoT-even phage Aeh1 encodes an extremely large gpwac (1,035 aa), and only the N-terminal 423 aa of this protein show homology with the T4 sequence. The long C-terminal domain of the Aeh1 gpwac has weak homology (23% identity) with the putative host specificity determinants of some E. coli prophages. This large chimeric wac gene is followed by two open reading frames (ORFs) of 534 and 201 aa, both of which are database orphans. The published electron micrographs of this phage (38) reveal fibrous structures running along the contractile tail to the baseplate. The quality of these images is not sufficient to establish whether these fibers are related to the collar structure. Nonetheless, the sequence data alone suggest that the structure and function of the Aeh1 gpwac protein may differ from that in T4. This possibility needs to be explored, as does the organization of the Aeh1 tail fibers. Such studies are now in progress as part of the sequence analysis of the complete Aeh1 genome (Krisch and Karam, http://phage.bioc.tulane.edu/index.html).
The sequences reported here, together with the previously published sequence of phage T4 gpwac, reveal at least five distinct types of C-terminal domains (Fig. 3).
Overexpression and biochemical characterization of the RB49 wac protein and its deletion derivatives. We have characterized the phage RB49 gpwac protein to determine whether it has biochemical properties similar to those of the T4 protein. We first determined whether this protein mediates its own trimerization as does the T4 gpwac. To do this, we constructed an expression vector pWRB49, which allows high-level expression of the intact RB49 gpwac under the control of the T7 promoter. To determine whether this protein could be correctly folded, we used the assay previously developed for the T4 homolog (14, 23). Correctly folded T4 gpwac has a series of properties that make it easily identifiable. It is soluble and only precipitates at relatively high (NH4)2SO4 concentration (>30% of saturation), but the precipitate can be redissolved easily in a low-salt buffer. Properly folded gpwac is eluted by low phosphate (10 mM) on hydroxylapatite chromatography. When lysates of cells overexpressing T4 gpwac are separated by SDS-PAGE without prior heating of the sample, the protein remains trimeric and migrates as a low-mobility band. This protein is resistant to trypsin digestion even at enzyme/protein ratios of 1:20. T4 gpwac mutants that are unable to correctly fold have a tendency to aggregate and to form insoluble inclusion bodies. They precipitate at low (18 to 22% of saturation) concentrations of (NH4)2SO4 and are hard to redissolve. They interact strongly with hydroxylapatite columns and cannot be eluted by 500 mM phosphate buffer. The protein is sensitive to trypsin digestion and does not form a characteristic low-mobility band when subjected to SDS-PAGE.
gpwac is well expressed from the pWRB49 plasmid (constituting ~30% of the total cell protein). This overexpressed gpwac protein is completely soluble and, when denatured, its mobility on SDS-PAGE corresponds to the size expected for a monomer (~65 kDa). When samples containing gpwac were not heat denatured prior to electrophoresis, a much lower mobility band was detected upon SDS-PAGE (Fig. 4). The overproduced gpwac protein precipitates at an (NH4)2SO4 concentration of 35% saturation but can be easily redissolved in 50 mM Tris-HCl (pH 7.5). Furthermore, it can be eluted from hydroxylapatite in 50 mM phosphate buffer. Treatment of the crude lysate with trypsin for 30 min at 37°C does not have any effect on the protein that can be detected by SDS-PAGE. However, if the sample was first heated for 1 min at 95°C before addition of the enzyme, it was completely degraded (data not shown). A circular dichroic spectrum of RB49 gpwac purified by hydroxylapatite chromatography allowed us to estimate the protein's secondary structure. The gpwac structure was partitioned as follows: 47% α-helix, 18% β structure, 13% turns, and 22% other structures. This percentage of α-helical structure is compatible with the RB49 and the T4 gpwac having coiled-coil domains of similar size.
α-helical coiled-coil domain. The functions of the various domains of the wac protein were dissected by deletion analysis. We constructed a deletion mutant (RBW-C) of RB49 fibritin that lacks the first 358 aa of this protein. The mutant makes a 238-aa polypeptide from a translation initiation sequence inserted in phase at residue 359. Based on the structural model for gpwac of T4 (Fig. 2), this N terminally truncated protein should contain the last three coiled-coil segments and the C-terminal globular domain. The overexpressed RBW-C protein is soluble and, after denaturation, its mobility in SDS-PAGE corresponds to ~20 kDa. In unheated protein samples this gpwac deletion protein migrates as an ~47-kDa protein (Fig. 4). Precipitation of the RBW-C protein requires an (NH4)2SO4 concentration of 37% of saturation. The protein is trypsin resistant, as is the intact RB49 fibritin. These data clearly indicate that RBW-C has a structure similar to the intact gpwac. Thus, as for T4, the N-terminal domain is not required for the RB49 protein to fold into the coiled-coil structure. We have also constructed an RB49 mutant (RBW-N) that produces a fibritin that is terminated by a TGA stop codon 20 aa downstream of the region of homology to T4 fibritin. This C terminally truncated protein, unlike the RBW-C protein, is totally insoluble and forms inclusion bodies. No SDS-resistant high-molecular-weight complex is present in unheated samples of this extract, and the protein was not resistant to trypsin. These results indicate that the RBW-N protein does not fold properly and that the absence of the C-terminal domain is probably responsible.
gpwac of phage 42. The gpwac of the phage 42 is interesting because it lacks the C-terminal foldon domain and contains no obvious replacement. In the initial analysis of phage 42 gpwac, we simply determined whether the polypeptide could be correctly folded when overexpressed in E. coli. The wac gene was amplified by PCR and cloned into the pET11d vector. E. coli cells overproducing this protein had inclusion bodies, and in unheated samples subjected to SDS-PAGE there was no band corresponding to a defined oligomeric species of the protein. Although the phage 42 fibritin is not being folded correctly under our conditions, it may be properly folded in B. cepacia either by a host- or phage-encoded chaperone or because of the intracellular conditions during infection. Another possibility is that even a much-reduced level of self-folding of the coiled-coil domain produces sufficient active gpwac to make infective phage 42 (V. P. Efimov, unpublished data; see also reference 4). Phage 42 grows very poorly and, for this reason, we have been unable to verify the presence of gpwac in the virion.
Target of gpwac interaction with the LTF. The geometry of the T4 phage particle (Fig. 1) suggests that the region near the tip of the wac fiber interacts with the knee region of the LTF, implying that the end of g34, g35, or g36 probably encodes the wac-binding site. The existence of a mutation in g36 with a phenotype very similar to the original wac mutant (15) lends support to a direct interaction between gpwac and gp36. If this hypothesis is correct, then distant T4-type phages with novel C-terminal wac sequences should also have novel motifs within their LTFs. Therefore, we focused our analysis on the LTF sequences of the T4-type phages that were known to have different versions of gpwac. For example, the LTF locus of the PseudoT-even bacteriophage RB49 had been sequenced previously (12). Interestingly, only the N-terminal portion of the g36 homolog retains homology to the T4 gp36 sequence (12); the C-terminal sequence is novel (12) and substantially longer than in the T4 protein. We also examined the LTF region of the phages RB43 and Aeh1 by first performing a genome scan to localize the LTF locus (11) and then sequencing this segment by primer walking. When a 10-kb fragment encompassing the entire LTF locus of RB43 bacteriophage was analyzed by BLASTX, it was found to encode proteins homologous to T4 gp34 and gp35 (Fig. 5), followed by two large ORFs of 756 and 816 aa. Both of these sequences begin with a short segment with good homology to the N-terminal sequence of T4 gp36, and the remainder of the ORF sequence is related to the central and C-terminal portions of the gp37 of various T-even bacteriophages. The sequences of these two ORFs are related but not identical. A putative transcription terminator and a putative late promoter separate them. Finally, downstream of the putative g36/g37 fusion proteins is a 174-aa ORF whose first 45 aa are 73% identical to the N-terminal sequence of the gp38 adhesin of T2 bacteriophage.
Genetic analysis of the interaction of gp36 and gpwac. The alignment of the gp36 amino acid sequences of the T-even bacteriophages and the PseudoT-even phage RB49 (Fig. 6) reveals a conserved GDT(M/L)TG motif (aa 179 to 184 in T4) that separates the C-terminal domain extension of RB49 from the part of gp36 homologous to the T-even phages. In the T-even phages, this motif is followed by a conserved 37-aa sequence containing four Pro residues. Such an abundance of Pro residues is incompatible with this part of T4 gp36 forming a compacted fibrous structure like the proposed β-helix structure of the LTF. Thus, the GDT(M/L)TG motif could be the boundary between the fibrous part of gp36 and a C-terminal globular domain mediating the gpwac-LTF interaction. To test this hypothesis, we created a mutation in gene 36, amS197, that removes the last 24 residues of gp36. If this C-terminal sequence is only involved in gp36's interaction with gpwac, then such a mutant should be partially viable in the nonsuppressing host strain E. coli BE. A plasmid carrying this mutated g36 sequence was created as described in Materials and Methods. A culture of the plasmid-containing E. coli CR63 (su+) cells was used to prepare a stock of wild-type T4 phage that contained rare recombinants with the plasmid's mutated g36 sequence. This phage stock was plated on CR63 (su+), and 150 individual plaques were tested on BE (su−) to identify recombinants having reduced viability on this host. Two plaques thus identified were purified by single plaque isolation on CR63 (su+), and stocks were prepared on the same host. DNA sequencing confirmed that both of these phages had the expected g36 amber mutation (named T4g36amC1). The mutant T4g36amC1 was inviable in host strains that did not suppress this nonsense mutation. Thus, the C-terminal part of gp36 must provide some indispensable function, perhaps being required for the correct folding of the entire protein or its interaction with gp37.
Δwac mutant constructed by V. Mesyanzhinov. This mutant resulted from a genetic exchange between the T4 mutant am12am13 and a plasmid containing the gene 12-wac-13 segment of the T4 genome but with a large HindIII-HindIII deletion within the wac gene. No PEG resistance selection was used to isolate this wac deletion phage. The Δwac mutant makes small plaques, as do the other wac mutants. When we compared the growth of T4 wild-type and Δwac phages in 6.4% PEG, we unexpectedly found that neither phage was able to lyse a culture in 5 h. Nonetheless, the control experiment with two PEG-resistant phages obtained by using a slightly modified version of the PEG selection procedure of Follansbee et al. (15) gave complete lysis in 2 to 2.5 h. We then repeated this experiment with only 5% PEG and, under these less restrictive conditions, although the wild-type phage does not lyse the culture after a 4-h incubation, both the Δwac mutant and the gene 36 mutant (T4g36W) lyse cells in 2 h. Thus, the wac-null mutant is only partially PEG resistant; it can adsorb efficiently in 5% PEG but not in 6.4% PEG. "Full" resistance to PEG of some wac-null mutants (15) appears to be due to additional genetic lesions selected in the presence of high levels of PEG. These data are fully compatible with our suggestion that the C-terminal part of gp36 (or the corresponding segment in the g36-37 fusion proteins in phages with an LTF organization such as that of RB42) is the site of interaction with fibritin.
| DISCUSSION |
|---|
The complete genome sequence is now known for six T4-type phages (see http://phage.bioc.tulane.edu/index.html). Although the acquisition, deletion, and exchange of small nonessential genetic elements appear to be frequent (11, 12, 19, 22), there is no evidence for the exchange of large modules with essential functions as in the phage λ family. Modular swapping among the T4-type phages has been convincingly documented in only a few genomic loci involved in host range adaptation (30, 32, 37). One of these involves the tail fiber genes (32, 37). In the LTF locus, modular shuffling requires only limited regions of nucleotide sequence homology and can occur between very distant or even unrelated phages (16, 32). Since the structures of these fibrous proteins do not involve long-range tertiary interactions, there are probably few sequence constraints on their genetic shuffling with other fibrous sequences of the same type. The mutual compatibility of sequences encoding fibrous structures, coupled with the strong selective advantage derived from acquisition of novel adhesin domains (37), probably suffices to explain the frequent modular shuffling within the LTFs.
The modular exchange of the C-terminal domain of gpwac documented here provides an additional clear example of a modular replacement in a T4-type phage structural gene. The functional and structural similarities of the fibers encoded by wac and the LTF locus indicate that their sequence plasticity may be the consequence of related mechanisms. Although gpwac belongs to a different structural class of fibrous proteins (α-helical) from that of tail fibers (β-structured), it does have a receptor recognition-like function, binding specifically to a component in the middle of the LTFs. The most probable explanation for the plasticity of the gpwac C-terminal region is the variability of the LTF target sequence to which it must bind.
In the T-even bacteriophages, rearrangements of LTF genes occur frequently within the g37 and g38 sequences (37, 40), which encode the most distal portion of the tail fiber (Fig. 3), whereas the proximal fiber components (gp34, gp35, and gp36) are relatively conserved. The genetic plasticity in the distal part of the LTF probably results from the fact that swapping in this region can lead to advantageous changes in viral adsorption specificity. Interestingly, the protein-protein interaction between the LTFs and gpwac could exclude exchanges that extend into the proximal portion of the LTFs, since such exchanges would require a coordinated alteration in the binding specificity of gpwac, whose gene lies in another region of the genome.
The reorganization of LTF regions that we observed in the bacteriophages depicted in Fig. 5 significantly alters gene 36 (or its functional equivalent). This observation, coupled with the correlated swapping of gpwac C-terminal domains in these phages, is compatible with our explanation that the domains of these two proteins interact. However, the apparent correlation of the plasticity in gpwac and gp36, by itself, does not allow us to rigorously conclude that the changes in the C-terminal domains of gpwac are necessitated by the alterations in g36 structure, although the ensemble of our data supports such a conclusion. The examination of the structure and function of a series of gp36 proteins leads to a similar suggestion. Based on the alignments of these gp36 sequences, we predicted that the C-terminal region of this protein is a probable target of interaction with gpwac in T4 phage and, furthermore, we showed that a point mutation in this region of gp36 has a phenotype identical to that of the wac deletion. Ultimate proof of the physical interaction between the C termini of gp36 and gpwac requires the isolation of intergenic suppressors of our g36 mutants and the demonstration that these are located in the predicted region of the wac gene. No such mutants have yet been isolated, but attempts to obtain them are in progress.
Our observation that phage carrying a null mutation in the wac gene have only partial resistance to PEG contradicts the original findings of Follansbee et al. (15) that wac-deficient mutants are resistant to 6.4% PEG. Nevertheless, we were able to obtain phages completely resistant to high concentrations of PEG by a slightly modified adaptation of their isolation procedure (15; data not shown). The simplest explanation for the difference between our results and those of Follansbee et al. is that their "wac" mutants had a secondary mutation that augmented the level of the phage's resistance to PEG. Such multiple mutants could have been erroneously identified as having a single mutation in the wac gene. The observed partial phenotype of Δwac, as well as of our g36 mutants, suggests that the environmental sensing function might actually occur in two steps. The LTF could interact only transiently with fibritin, and this interaction would facilitate (but not be absolutely required for) a much more stable interaction between the LTF and some other phage component (the most probable candidate being the tail sheath protein gp18). This second interaction could play a key role in the sensitivity to PEG and to other environmental agents that have long been known to influence phage infectivity, such as tryptophan, indole, and potassium glutamate (6, 21). It would be extremely interesting to fully characterize the mechanism by which such environmental agents activate or inhibit the adsorption of various T4-type phage strains. In the case of tryptophan or indole (T4B and T2H), it seems likely that a protein-protein interaction between the tail sheath (gp18) and the LTF is responsible. In other examples of environmental sensing (pH, ionic strength, and temperature) by the T4-type phage, gpwac is involved. As one considers progressively more distant T4-type phage, the environmental sensing functions may become more exotic. For example, it is known that the T4-type phage S-PM2 only infects its host, the photosynthetic cyanobacterium Synechococcus spp., under conditions of high illumination (N. H. Mann, unpublished data). A light-sensitive interaction between the LTFs and another virion component such as gpwac could explain such sensing.
While the present study was in preparation, the genome sequences of the PseudoT-even phage 44rr2.8t (Krisch and Karam, http://phage.bioc.tulane.edu/index.html) and the SchizoT-even phage KVP40 (25) appeared. The sequences of the wac genes of these phages are fully compatible with all of the conclusions we have drawn here about the domain structure of gpwac. In particular, both of these additional gpwac sequences contain novel C-terminal domains that are not related to any of those described here, and in both cases significant rearrangements can be observed in the genes coding for the distal half of the LTFs.
In conclusion, the present study has used phylogenetic, genetic, biochemical, and biophysical methods to analyze the structure, function, and evolution of the T4-type phage collar fiber, gpwac. Sequence comparisons argue convincingly that the wac gene has a modular construction. Surprisingly, the sequences of the C-terminal domain of gpwac are completely unrelated in a number of diverse T4-type phages. This implies a large gene pool of alternative modules encoding the gpwac C-terminal domain. This portion of the wac protein has two functions in T4: it initiates the folding of gpwac subunits into a coiled-coil structure, and it binds specifically to the LTFs. In the mature virion, the gpwac-LTF interaction contributes to holding the LTFs in a retracted conformation that renders the phage noninfectious. The retraction of the LTFs is sensitive to ambient conditions such as ionic strength, pH, and temperature and, as a consequence, T4 infectivity can be adapted to its environmental situation. The considerable natural diversity of the gpwac C-terminal domains suggests that they could encode different environmental sensing capacities. These results justify a more thorough examination of the sensing mechanisms that the diverse T4-type phages use to adapt their infectivity to environmental conditions.
| ACKNOWLEDGMENTS |
|---|
This research was supported by the CNRS and by grants from the Ministère de la Recherche (PRFMMIP and ACI-Microbiologie) and the GIP HMR. The CNRS-IFR109 and the Toulouse Genopole provided funding for the DNA sequencing facilities. A.L.'s research in Moscow is supported by a grant from the Russian Foundation for Basic Research (RFBR) (04-04-48789).
| FOOTNOTES |
|---|
Present address: S. N. Winogradsky Institute of Microbiology, Russian Academy of Science, 117312 Moscow, Russia.
Present address: Centre de Biochimie Structurale, CNRS-UMR 9955, INSERM-U554, Universite de Montpellier I, F-34090 Montpellier, France.
Present address: Départment de Biochimie Médicale, University Medical Centre, 1211 Geneva 4, Switzerland.
| REFERENCES |
|---|