Bacteriophage P22 Antitermination boxB Sequence Requirements Are Complex and Overlap with Those of λ

ABSTRACT Transcription antitermination in phages λ and P22 uses N proteins that bind to similar boxB RNA hairpins in regulated transcripts. In contrast to the λ N-boxB interaction, the P22 N-boxB interaction has not been extensively studied. A nuclear magnetic resonance structure of the P22 N peptide-boxB left complex and limited mutagenesis have been reported but do not reveal a consensus sequence for boxB. We have used a plasmid-based antitermination system to screen boxBs with random loops and to test boxB mutants. We find that P22 N requires boxB to have a GNRA-like loop with no simple requirements on the remaining sequences in the loop or stem. U:A or A:U base pairs are strongly preferred adjacent to the loop and appear to modulate N binding in cooperation with the loop and distal stem. A few GNRA-like hexaloops have moderate activity. Some boxB mutants bind P22 and λ N, indicating that the requirements imposed on boxB by P22 N overlap those imposed by λ N. Point mutations can dramatically alter boxB specificity between P22 and λ N. A boxB specific for P22 N can be mutated to λ N specificity by a series of single mutations via a bifunctional intermediate, as predicted by neutral theories of evolution.

λ and P22 share a closely related system of regulating the expression of early lytic genes by allowing transcription past terminators in the P left and P right operons (47). The recognition of the boxB RNA hairpins of nut (N-utilization) sites by the binding domains of the viral N proteins (Fig. 1A, B, and C) initiates the assembly of an antitermination complex. This complex contains N, host factor NusA, RNA polymerase, and other host factors bound to the viral nut site boxB and the nonhairpin boxA, allowing transcription to proceed through downstream transcription termination signals (32). The four wild-type nut sites of λ and P22 are similar in sequence but differ even between boxB left and boxB right in the same virus. Notably, both P22 boxBs have a C in the loop where both λ boxBs have an A, P22 boxB stems are longer than those of λ by 1 base pair, and P22 boxB right appears to have a noncanonical C:C base pair. boxBs bind noncognate N peptides poorly in vitro (2,8,44). Likewise, noncognate N-nut interactions do not function in vivo (14,28), and noncognate N proteins do not rescue N-deficient viruses (10).
The details of the λ N-boxB interaction have been revealed by extensive genetic and biochemical work (47) and by solution state nuclear magnetic resonance (NMR) structures of the arginine-rich domain of the N protein bound to λ boxB left (29) and λ boxB right (39) RNA. Genetic and biochemical studies of the P22 interaction are less complete (13,44) but are supported by the solution state NMR structure of the arginine-rich domain of the N protein bound to P22 boxB left (5). λ and P22 boxBs bind their N peptides as hairpins in which 4 of 5 bases adopt a GNRA-like tetraloop structure (Fig. 2). Tetraloops are frequently found in RNAs serving structural roles as stable caps to stems and as motifs recognized by proteins; GNRA tetraloops are noted as particularly frequent and thermodynamically stable (9). The λ and P22 boxB loops adopt distinct conformations: the fourth nucleotide (nt) of the λ boxB pentaloop is extruded from the GNRA tetraloop (4-out), while in the P22 boxB it is the third nucleotide that is extruded (3-out). As in the canonical GNRA tetraloop, there is a nonstandard, sheared G:A base pair and extensive stacking. The bound peptides adopt helical conformations, occupy the major grooves, and make contacts with boxB stems and loops.
NMR data support a model in which the N peptides recognize a specific conformation of boxB, with little direct recognition of the sequence (5,29,39). Most observed contacts are not sequence specific but are made to the backbone or are hydrophobic in nature. The only certain, base-specific hydrogen bonds detected by NMR are to the guanine in the sheared G:A pair at the base of the loop. No other proposed peptide-base hydrogen bond is supported by mutagenesis. Cai et al. (5) report possible hydrogen bonding between a P22 N lysine and the base pair adjacent to the loop of P22 boxB left, Legault et al. (29) propose hydrogen bonds between the λ N peptide and a base adjacent to the loop of λ boxB left, while Scharpf et al. (39) have evidence only for hydrogen bonds to the guanine of the sheared G:A base pair of λ boxB right. λ-P22 specificity may reflect conformational energetics, likely governed by subtle base-stacking interactions.
The regulatory role of RNA-protein recognition in important biological systems has elicited much interest (16). The diversity of known RNA-protein complexes raises the question as to which evolutionary mechanisms are capable of finding new recognition strategies. Kimura's neutral theory of molecular evolution contends that for any given genotype, there are sufficient mutants of neutral fitness to create smooth paths to different phenotypes (26). Though computational modeling of evolving RNA secondary structure strongly supports neutral theories (21,38,45), few experimental tests using biologically active RNAs have been reported. Studies of small hairpin RNAs that bind arginine-rich peptides have shown that single substitutions and base pair changes are enough to create changes of specificity (23,41), suggesting that neutral paths between distinct activities can be found when sequences are small. In the case of protein-binding RNAs, mutations leading from relaxed-specificity sequences to sequences with distinct binding preferences would be equivalent to speciation.
We examined the requirements of boxB recognition by P22 N using a plasmid-based reporter system that reconstitutes antitermination in Escherichia coli (14). boxBs active with P22 N were obtained from screens of two randomized loop libraries. Assays of selected and designed boxBs reveal that P22 N recognition requires a GNRA-like pentaloop with no simple requirements on the remaining sequences in the loop or stem. The base pair adjacent to the loop is restricted and appears to modulate N binding in cooperation with the loop and distal stem. A few GNRA-like hexaloops were found to have moderate activity. Single-nucleotide mutations alter boxB specificity and connect boxBs specific for P22 and λ N proteins, as predicted by neutral theories of evolution.

MATERIALS AND METHODS
General. Restriction enzymes, T4 polynucleotide kinase, and T4 DNA ligase were obtained from Roche (Germany). Bacterial-medium components were obtained from Oxoid (United Kingdom). Fine chemicals were obtained from Amersham (United Kingdom), Sigma (United States), and Amresco (United States).
Construction of boxB reporter and library plasmids. Wild-type sequences of λ (accession no. NC_001416) and P22 (accession no. NC_002371) were obtained from GenBank. Each reporter plasmid was constructed with the boxB of interest replacing only the boxB of the λ nut left site, such that differences between reporters were confined to boxB. boxB cloning oligonucleotides were designed to form double strands with PstI and BamHI overhangs to replace the entire nut site of pACnutTAT13. The underlined regions in the oligonucleotides used to construct λ nut left indicate the boxB replacement sequences (LLf, 5′-GTCGACGCTCTTAAAAATTAAGCCCTGAAGAAGGGCAGCATTCAAAGCAGG-3′; LLr, 5′-GATCCCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCGACTGCA-3′). These pairs of oligonucleotides were annealed and ligated into the pACnutTAT13 backbone previously digested with PstI and BamHI. Library inserts were constructed by primer extension with an antisense primer (5′-CGGGGATCCCTGCTTTGAATGC-3′) on sense templates (the 7-nt library, 5′-GGGCTGCAGTCGACGCTCTTAAAAATTAATGCGCNNNNNNNGCGCGAGCATTCAAAGCAGGGATCCCCG-3′, or the 6-nt library, 5′-GGGCTGCAGTCGACGCTCTTAAAAATTAATGCGCTNNNNNNAGCGCGAGCATTCAAAGCAGGGATCCCCG-3′), followed by digestion with PstI and BamHI. The digested double-stranded library oligonucleotide was ligated to the backbone, and the ligation mixture was used to transform competent DH5α bacteria by electroporation. The resulting colonies were scraped from the plates and used to prepare the plasmid for screening.
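As a quick consistency check on the cloning design described above, the following sketch tests whether the two nut left oligonucleotides are fully complementary and leave the expected single-stranded ends after annealing. It assumes the sequences exactly as printed here with line-break spaces removed; the helper names (`reverse_complement`, `anneals_with_overhangs`) are illustrative and are not part of the original methods.

```python
# Check that LLf and LLr anneal into a duplex with BamHI- and PstI-compatible ends.
# Assumption: sequences are copied from the text with line-break spaces removed.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

LLF = "GTCGACGCTCTTAAAAATTAAGCCCTGAAGAAGGGCAGCATTCAAAGCAGG"
LLR = "GATCCCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCGACTGCA"

# BamHI digestion leaves a 5' GATC overhang; PstI digestion leaves a 3' TGCA overhang.
BAMHI_5P_OVERHANG = "GATC"
PSTI_3P_OVERHANG = "TGCA"

def anneals_with_overhangs(forward: str, reverse: str) -> bool:
    """True if `reverse` equals the reverse complement of `forward`
    flanked by the BamHI 5' overhang and the PstI 3' overhang."""
    expected = BAMHI_5P_OVERHANG + reverse_complement(forward) + PSTI_3P_OVERHANG
    return reverse == expected

if __name__ == "__main__":
    print("LLf/LLr form the expected duplex:", anneals_with_overhangs(LLF, LLR))
```

The same check can be applied to any designed boxB replacement pair before ordering oligonucleotides, since a single transcription error in either strand makes the comparison fail.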
P22 N fusion-expressing plasmid. To construct the P22 N fusion supplier plasmid pBRNP22N, a pair of synthetic DNA oligonucleotides (P22NF, 5′-CATGTTTGCAGGCAATGCTAAAACTCGCCGTCATGAGCGGCGCAGAAAGCTAGCCATAGAGCGCAATGCA-3′, and P22NR, 5′-CATTGCGCTCTATGGCTAGCTTTCTGCGCCGCTCATGACGGCGAGTTTTAGCATTGCCTGCAAA-3′) were designed with NcoI and BsmI overhangs to replace the N terminus of the λ N protein in pBR-ptac-N* with the RNA-binding domain of P22 N. These oligonucleotides were annealed and ligated into the pBR-ptac-N* backbone to form pBRNP22N. Ligations were transformed into N567 strains of E. coli, and minipreps were selected for each construct.
[...] supplier-containing cells and plated on tryptone agar containing X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside) to reveal the general phenotype. At least three representative clones were restreaked, and fresh plasmid was prepared and retransformed to confirm the phenotype prior to large preparations for sequencing. boxB reporter plasmids were sequenced with the forward primer PACF2 (5′-AATCACTGCATAATTCGTGTC-3′), and the P22 N fusion supplier plasmid pBRNP22N was sequenced with PBRNF2 (5′-ACTCCCGTTCTGGATAATG-3′) and PBRNR2 (5′-GGCTTGCTGTACCATGTG-3′) using an ABI Prism BigDye Terminator v3.1 ready reaction cycle sequencing kit on an ABI 3100-Avant genetic analyzer. The entire insert was confirmed using Chromas Lite software from Technelysium (United Kingdom).
Library screening and X-Gal plate assay. Competent N567 host cells were transformed with the wild-type λ N supplier (pBR-ptac-N*), the P22 N fusion supplier (pBRNP22N), or the HIV-Rev N fusion supplier (pBRN-HIVRev) by standard CaCl2 procedures. For each N-boxB plate assay (36), individual boxB reporter plasmids (including the controls of λ boxBs, P22 boxBs, and HIV RRE reporters in parallel) were transformed into λ N, P22 N, and HIV Rev N supplier cells.
Typically, 10 to 100 ng of plasmid per 100 μl of competent cells was transformed by heat shock and plated on tryptone plates containing 50 μg/ml ampicillin and 15 μg/ml chloramphenicol as antibiotics and 80 μg/ml X-Gal as the chromogenic substrate of the β-galactosidase reporter protein. Experiments with pBRNP22N and pBRNHIVRev also included 0.05 mM IPTG (isopropyl-β-D-thiogalactoside) to induce the tac promoters expressing the N protein and the reporter transcript. The plates were viewed and scored after 1 and 2 days at 34°C. The intensity of the blue colonies was used to score the antitermination activity for selections and the preliminary assessment.
ONPG solution assay of antitermination. For each N-boxB interaction, representative colonies were picked from X-Gal plates for solution assays (31). At least three independent colonies were used for each interaction. For measurement of N-mediated antitermination, cultures were grown overnight at 37°C with aeration in tryptone with 50 μg/ml ampicillin and 15 μg/ml chloramphenicol as antibiotics and with 0.5 mM IPTG. The cells were then permeabilized and assayed for β-galactosidase activity using ONPG (o-nitrophenyl-β-D-galactoside), and units of β-galactosidase were calculated according to Miller (31). The percentages of activities are reported normalized to P22 boxB left for P22 N and λ boxB right for λ N because the absolute activities of controls sometimes varied up to threefold between experiments on different days. The levels of activation of specific reporters were calculated as the units of β-galactosidase observed with activation by P22 N or λ N divided by the units observed by N Rev activation.
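The calculations described above can be sketched as follows. This is a minimal illustration that assumes the standard Miller formula, 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600), with t in minutes and v in milliliters of culture assayed; the function names and all numeric inputs are hypothetical placeholders and are not data from this study.

```python
# Miller-unit calculation and the two normalizations described in the text.
# All example readings below are hypothetical, chosen only to show the arithmetic.

def miller_units(od420: float, od550: float, od600: float,
                 minutes: float, volume_ml: float) -> float:
    """Standard Miller formula: 1000 * (OD420 - 1.75*OD550) / (t * v * OD600)."""
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)

def percent_of_cognate(units: float, cognate_units: float) -> float:
    """Reporter activity as a percentage of the cognate reporter
    measured in the same experiment with the same N supplier."""
    return 100.0 * units / cognate_units

def fold_activation(units_with_n: float, units_with_control: float) -> float:
    """Units observed with P22 N or lambda N divided by units observed
    with the Rev fusion control supplier."""
    return units_with_n / units_with_control

if __name__ == "__main__":
    mutant = miller_units(od420=0.45, od550=0.02, od600=0.50, minutes=10, volume_ml=0.1)
    cognate = miller_units(od420=0.90, od550=0.02, od600=0.50, minutes=10, volume_ml=0.1)
    print("mutant units:", round(mutant, 1))
    print("percent of cognate:", round(percent_of_cognate(mutant, cognate), 1))
```

Normalizing each reporter to the cognate control from the same day, as the text describes, removes the day-to-day variation in absolute units from the comparison.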
Structure visualization. Protein Explorer (30) was used to view the solution state NMR structure of λ N peptide-λ boxB right (Protein Data Bank accession no. 1QFQ [39]), the NMR structure of P22 N peptide-P22 boxB left (Protein Data Bank accession no. 1A4T [5]), and the X-ray crystal structure of a Haloarcula marismortui large ribosomal subunit (Protein Data Bank accession no. 1JJ2 [27]).

RESULTS
We first confirmed N-boxB activity and specificity using the two-plasmid reporter system of Franklin (14). This reporter system reconstructs N-mediated antitermination in E. coli and has been used extensively for the study of N-boxB interaction and heterologous RNA-protein interactions. Four wild-type boxB reporters (Fig. 1A) were constructed in which λ and P22 boxBs were presented in the context of the λ nut left site to avoid influence from variations in boxA and the remainder of the nut sites (Fig. 1B). λ N was expressed as the full-length protein, whereas the P22 N RNA-binding domain was expressed as a fusion to the λ activation domain (Fig. 1C). The P22 N fusion limited the differences between N proteins to the well-characterized RNA-binding domains and avoided the high level of background activity displayed by a full-length, wild-type P22 N construct (15; data not shown). The P22 N fusion supplier was tested and found to have low activity on a heterologous HIV RRE nut reporter and strong activity on its cognate boxB reporters as expected, though its absolute activity was substantially less than that of the λ N supplier. The P22 N fusion supplier plasmid rescues the P22 N-null phage (imm P22 Nam24 [13]) but not the λ N-null phage (Nam7 am53 [13]; data not shown). Thus, while the P22 N fusion activity likely directly reflects boxB-peptide affinity, the λ N activity likely reflects the ability of the boxB loop to bind λ N in the proper 4-out conformation that recruits NusA and allows optimal antitermination (49). The P22 N fusion supplier was used for all further experiments.
We measured the activities of the four wild-type boxB reporters with the λ N supplier pBR-ptac-N* and with the P22 N fusion supplier pBRNP22N using a solution assay with the β-galactosidase reporter gene (31) (Table 1). The boxBs display specificity for their cognate N proteins; the P22 boxB right reporter displays very high specificity. In vitro experiments indicate substantial affinity between λ boxB left and the P22 N peptide (44) and between P22 boxB left and the λ N peptide (7). Other work shows that λ boxB right has significant activity with P22 N in vivo (13,19) and that P22 boxB right is highly specific (13,28). Despite the uniform context of the boxB-peptide interaction, the specificities displayed by our boxB reporters largely agree with published data, though there are unresolved discrepancies between published boxB-N peptide specificities in vitro (2,7,43,44).
Screening of a boxB library. In order to reveal the loop consensus for P22 boxB, we constructed a plasmid library [...] To remove false positives, the reporter plasmid was separated from the N supplier plasmid and used to transform N567 Rev N, where constitutively active reporters (presumably resulting from mutations outside boxB) show activity and reporters specific to P22 N are inactive. White colonies (500; 64%) were selected and used to isolate the reporter plasmid. After one additional round of selection on N567 P22 N yielding colonies that were 55% blue, plasmids were individually examined for activity on X-Gal plates. Nineteen clones with activities similar to that of P22 boxB left were sequenced, yielding 14 unique sequences. These were assayed for antitermination (Table 2). Neither P22 boxB left nor any sequence resembling P22 boxB right was recovered, indicating that our screen was not exhaustive. Indeed, only 1 of the 14 selected was a boxB left point mutant.
Only GNRA-like pentaloops are found in a randomized loop library. As expected of GNRA-like pentaloops, all selected sequences contained a G1:A5 pair (Table 2). All four nucleotides occurred at positions 2, 3, and 4, violating a strict GNRA-like consensus, assuming that all boxBs adopt the expected 3-out conformation when bound. Purines are common at positions 2 and 4 in strongly active boxBs, suggesting that base-stacking interactions are important. In the P22 boxB left NMR structure (5), A4 fulfils the purine role with an intramolecular hydrogen bond, but no prior mutagenesis in boxB had indicated its importance. Consistent with these data, GNRA-like pentaloops are known to exist without a purine in this role (12,42).
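The distinction drawn above between a G1:A5 pair and a strict GNRA consensus can be made concrete with a short sketch. It removes the extruded nucleotide from a pentaloop (position 3 for the 3-out conformation, position 4 for 4-out) and asks whether the remaining four bases match G-N-R-A. The helper names are illustrative; the loop GACAA is inferred here from the position labels used in the text (G1, A2, C3, A4, A5 of P22 boxB left), and GCUUA is a purely hypothetical loop used only to show a failing case.

```python
# Test whether a pentaloop reduces to a strict GNRA tetraloop once one
# nucleotide is extruded (3-out as bound by P22 N, 4-out as bound by lambda N).

PURINES = {"A", "G"}

def extrude(loop: str, position: int) -> str:
    """Return the loop with the 1-based `position` removed."""
    if not 1 <= position <= len(loop):
        raise ValueError("position outside loop")
    return loop[:position - 1] + loop[position:]

def is_strict_gnra(tetraloop: str) -> bool:
    """True if the four bases match G-N-R-A (N = any base, R = purine)."""
    return (len(tetraloop) == 4
            and tetraloop[0] == "G"
            and tetraloop[2] in PURINES
            and tetraloop[3] == "A")

def has_g1_a5(pentaloop: str) -> bool:
    """True if a 5-nt loop has G at position 1 and A at position 5."""
    return len(pentaloop) == 5 and pentaloop[0] == "G" and pentaloop[4] == "A"

if __name__ == "__main__":
    P22_LEFT_LOOP = "GACAA"   # inferred from the mutant names in the text
    HYPOTHETICAL = "GCUUA"    # illustrative only, not a clone from this study
    for loop in (P22_LEFT_LOOP, HYPOTHETICAL):
        print(loop, has_g1_a5(loop),
              is_strict_gnra(extrude(loop, 3)),
              is_strict_gnra(extrude(loop, 4)))
```

A loop can therefore satisfy the G1:A5 requirement while failing the strict purine requirement in one or both extrusion modes, which is the sense in which the selected sequences are GNRA-like rather than GNRA.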
A series of mutant boxBs were constructed to explore the determinants of P22 N activity (Table 3). The A2→G mutation of P22 boxB left (clone D1) does not reduce activity, as was expected from published in vitro data (44). The C3→A mutation (D2) reduces activity to 85%, while an A2→G C3→A double mutation (D3) reduces activity to 36%. In contrast, C3→A A4→G mutations (D4) maintain moderate to wild-type activity, as was also expected from published data (44). The low frequency of positive sequences, their similarity to GNRA tetraloops, and the lack of any observed hydrogen bonds between the peptide and bases are consistent with a model where the ability of boxB to adopt the required conformation depends on favorable base-stacking interactions (24), as well as hydrophobic interactions with the N protein (4) and ancillary factors (49).
A:U and U:A are preferred adjacent to the loop. The absence of a Watson-Crick base pair adjacent to the loop of P22 boxB right suggests either a tolerance of nonpaired bases or C:C pairing. The loop of phage 21 boxB adopts a structurally related conformation, and NMR-based molecular modeling suggests C:C pairing (7). Our library's random region encompassed these positions in order to examine their importance. Only U:A (wild type) and A:U base pairs were found in strongly positive clones, while the only boxB with a C:G base pair had reduced activity (Table 2, reporter L13). To address the role of the bases adjacent to the loop, we constructed a series of boxBs with substitutions at these positions (Table 3, D5 to D12). The placement of A:U into boxB left did not reduce activity (D5), though it did affect boxB left mutants (D6, D7). Interestingly, C:C, C:G, G:C, and wobble pair replacements strongly reduced activity to 15% or less in the context of boxB left (D8 to D12).
How does boxB right tolerate the apparent C:C pair? The U:A substitution adjacent to the loop of boxB right increased activity (D13), indicating that U:A is also preferred in this context and suggesting that the two base pair differences between boxB left and boxB right at the beginning of the stem contribute to boxB right activity. The separate placement of two boxB right base pairs into a boxB left context (D14, D15) increased and decreased activity, respectively. These data suggest that the contiguous stacking from the beginning of the stem through the loop allows distal mutations to modulate the backbone conformations required for binding P22 N.
Screening a hexaloop library yields few active boxBs. Since the GNRA motif is found imbedded in larger loops (1), we constructed a library in which the 5-nt loop of boxB left was replaced with 6 random nt. This library should contain 4,096 unique sequences. Approximately 10,000 colonies were screened in N567 P22 N on X-Gal plates, and 287 blue colonies were picked, pooled, and screened against Rev N. Twelve colonies of the negative clones (90%) were collected and sequenced, yielding eight unique sequences. Of these, three were hexaloops and five were pentaloops. The unselected library pool was sequenced and showed no significant contamination with random pentaloops (data not shown), suggesting that active hexaloops are far rarer than pentaloops. Interestingly, the three selected hexaloops are all related by single insertions into high-activity boxBs (Table 4). A designed hexaloop (D16) and heptaloop (D17) were found to have minimal activity. We speculate that larger loops are more likely to accommodate extruded bases where they sterically hinder protein binding. Indeed, two hexaloops and one heptaloop with structures resembling GNRA tetraloops (20) found in a Haloarcula marismortui large ribosomal subunit (27) accommodated the extra nucleotides where they would interfere with N protein binding.
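The library size quoted above, and a rough sense of how completely roughly 10,000 colonies sample it, can be worked out with a short sketch. It assumes an idealized library in which every sequence is equally represented and each colony is an independent draw; real synthetic libraries are biased, so the coverage figure is an upper-bound style estimate rather than a measured value. The function names are illustrative.

```python
# Library complexity and idealized sampling coverage for randomized loops.
# Assumption: uniform representation and independent sampling of colonies.

def library_size(random_positions: int, alphabet: int = 4) -> int:
    """Number of distinct sequences for a fully randomized stretch."""
    return alphabet ** random_positions

def expected_coverage(colonies: int, size: int) -> float:
    """Expected fraction of distinct sequences sampled at least once."""
    return 1.0 - (1.0 - 1.0 / size) ** colonies

if __name__ == "__main__":
    hexaloop = library_size(6)   # 6 random nt in the hexaloop library
    sevenmer = library_size(7)   # 7 random nt in the 7-nt library
    print("6-nt library:", hexaloop, "sequences")
    print("7-nt library:", sevenmer, "sequences")
    print("idealized coverage, 10,000 colonies of the 6-nt library:",
          round(expected_coverage(10_000, hexaloop), 3))
```

Under these assumptions a screen of this size would be expected to miss a small fraction of hexaloops, which is consistent with treating the screen as extensive but not exhaustive.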
Mutations alter boxB specificity toward P22 N and λ N. The diversity of boxB variants active with P22 N suggested that some may also function with λ N. Though the recognition strategies of P22 and λ N are distinct, the origin of specificity is not obvious, and some sequences might be able to participate in both recognition strategies. Tan and Frankel (44) reported a boxB (equivalent to D4 here) that binds both P22 N and λ N peptides in vitro. We first designed and assayed some additional reporters with λ boxB stems (Table 5), which showed that the λ boxB stem did not abrogate P22 N activation. We measured the activities of all described boxB reporters with P22 and λ N proteins (Table 6). Using the ratio of the percentage of P22 N to the percentage of λ N activity to represent specificity, a wide range of specificities is apparent (Fig. 3A). While a few boxBs had higher specificity for λ N than did the wild-type λ boxBs, none were found to be more specific for P22 N than P22 boxB right, possibly because most variants were in a P22 boxB left context and P22 boxB right has extremely high specificity. Remarkably, many boxBs have relaxed specificity and maintain at least moderate activity. Several boxBs selected for P22 N activity have more than 50% activity with λ N, suggesting that bifunctional boxBs may be relatively common.
Stem and loop position 3 make a clear contribution to specificity. To understand the basis for boxB specificity, nucleotide changes in different boxB contexts were examined. We observe that λ N tolerates boxBs with A3→C loop mutations (D18, D22). In contrast to those of other workers, our results suggest that the purine role can be fulfilled by C3 in the 4-out GNRA-like pentaloop (6,11). We cannot account for these discrepancies, but note that we have used a different reporter system and that under higher λ N expression, Doelling and Franklin report that the A3→C λ boxB right has moderate activity (11). In all other cases where comparisons can be made, our results agree well with published data. Interestingly, changing from C3 to A3 in every context shifted specificity from P22 to λ N activity, sometimes without a loss of the activity of one or the other N (Fig. 3A).
a Clones designated with L were selected from a randomized loop library, and those designated with D were designed boxB replacements. All boxBs are presented on a P22 boxB left stem in the context of the λ nut left site.
b Only loop and adjacent base pair sequences are shown; underlining indicates nucleotides different from those of P22 boxB left.
Published reports indicate that λ N is tolerant of mutations in the boxB stem so long as most of the base pairing is preserved (6,11,44). We find that exchanging a P22 boxB left stem with that of λ boxB shifted specificity toward λ N, again without a necessary loss of activity of one or the other N (Fig. 3B). These coordinate shifts suggest that position 3 and the stem directly contribute to specificity. In contrast, though U:A-to-A:U changes adjacent to the loop are tolerated by P22 N and λ N in their cognate boxBs, the base pair switch causes no consistent alteration of specificity in other contexts (Fig. 3C). These data suggest that this base pair has a complex role coupled with the stem and loop sequence.
Single-nucleotide mutations can switch specificity between P22 and λ N. Point mutants of wild-type boxBs have a wide range of specificities, including relaxed specificity (Fig. 3D). Relaxed-specificity boxBs suggest that transitions between P22 and λ recognition strategies may occur without an intermediate loss of function. We observed no series of single mutations connecting wild-type P22 and λ boxBs, because they differ by a base pair in the stem that was not mutagenized. However, a single mutation transforms P22 N-specific D5 into moderately bifunctional D6, which is a single mutation from λ N-specific D7. Indeed, a single substitution connects P22 N-specific L4 to λ N-specific D7.
FIG. 3. Specificities of mutant boxBs. Representative boxB reporters in this study are plotted by their percentages of the activities with P22 N (abscissa) and λ N (ordinate) relative to those with P22 boxB left and λ boxB right, respectively (Table 6). All clones are as in Tables 1 to 5.
b The percentage of the cognate's activity is calculated as the activity of the reporter relative to that of the cognate reporter (P22 boxB left for P22 N and λ boxB right for λ N).
c The ratio of activity is the percentage of P22 N activity divided by the percentage of λ N activity and indicates specificity.
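The idea of a path of single substitutions, together with the specificity ratio defined in the table footnote, can be sketched as below. This is only an illustration: the D5, D6, D7, and L4 sequences are listed in tables that are not reproduced here, so the example instead uses loop sequences that follow from the mutant names in the text (P22 boxB left GACAA, D2 = C3→A, D4 = C3→A A4→G). The function names and the percentages in the usage lines are hypothetical.

```python
# Single-substitution paths between loop sequences and the specificity ratio.
# Loop sequences are inferred from mutant names in the text; percentages are made up.

from typing import Sequence

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def is_single_step_path(seqs: Sequence[str]) -> bool:
    """True if each consecutive pair differs by exactly one substitution."""
    return all(hamming(a, b) == 1 for a, b in zip(seqs, seqs[1:]))

def specificity_ratio(pct_p22: float, pct_lambda: float) -> float:
    """Percentage of P22 N activity divided by percentage of lambda N activity."""
    return pct_p22 / pct_lambda

if __name__ == "__main__":
    path = ["GACAA", "GAAAA", "GAAGA"]   # boxB left loop -> D2 -> D4
    print("single-step path:", is_single_step_path(path))
    print("direct distance:", hamming(path[0], path[-1]))
    # Hypothetical activities: a ratio near 1 would indicate relaxed specificity.
    print("ratio:", specificity_ratio(pct_p22=60.0, pct_lambda=50.0))
```

Whether such a path is neutral in the evolutionary sense depends on the measured activity of every intermediate, which is what the reporter assays supply.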

DISCUSSION
Our results show that P22 N recognizes GNRA-like pentaloops, without strict adherence to the purine requirement. P22 N can also recognize GNRA-like hexaloops with reduced activity. Many boxB variants are recognized, but no simple pattern emerges. The subtle effects of base stacking likely dictate the ability of the pentaloop to adopt the conformation necessary for P22 N binding. P22 N prefers U:A and A:U adjacent to the loop and tolerates the C:C pair of boxB right with the cooperation of distal-stem base pairs. Single mutations can dramatically alter boxB specificity between P22 and λ N. Though we found no path between wild-type boxBs, we did find that a series of single mutations connects P22 and λ N-specific boxBs via a relaxed-specificity boxB, as predicted by neutral theories of evolution. Libraries were randomized only in the loop region, and screening was not exhaustive, suggesting that more bifunctional boxBs exist. Our results with P22 N illustrate the complexity of even small RNA-peptide interactions.
While our results relate to the P22 boxB-N peptide interaction, we have recapitulated the interaction in a somewhat heterologous antitermination system that has the added complexity of reflecting the ability of host factors to assemble on the boxB-N complex (as reported by Conant et al. [8a]). Mutations that do not affect N binding can affect antitermination (6,32,37), and heterologous RNA-protein interactions display substantially lower activities than would be predicted by their in vitro affinities (17). Lower absolute activities of non-λ interactions are consistent with the model where λ N tryptophan 18 stacks atop the λ boxB 4-out pentaloop, creating a precise conformation sensed by NusA, which then allows optimal antitermination (49). The P22 N fusion presumably does not permit proper NusA binding, since there is no residue to act as λ N tryptophan 18 and boxB adopts a 3-out conformation when bound to P22 N (4).
The sequence requirements for N binding may be particularly complex because of the relative absence of base-specific hydrogen bonds. In light of the NMR evidence of extensive contacts between P22 N peptide and the 5′ backbone of the lower stem (5), even minor conformational changes could drastically affect peptide binding. We speculate that the P22 loop consensus is obscure (besides the G:A pair), because the primary role of loop bases may be to promote a 3-out pentaloop for P22 N binding. Even the extruded C3 makes only hydrophobic interactions with N protein. This model is similar to that proposed for λ boxB, in which the extent of peptide binding reflected the proportion of the 4-out pentaloop in unbound boxB (24). The marked preference for U:A and A:U base pairs adjacent to the loop also suggests subtle base-stacking interactions that influence the conformation of the loop and stem. Other examples are known where base pairs influence the stability of proximal loops, such as that of λ boxB (24) and a related tetraloop (46). Interestingly, U:A and A:U base pairs adjacent to the loop of bovine immunodeficiency virus (BIV) trans-acting responsive element indirectly modulate the binding of the BIV Tat peptide to the stem (41). We speculate that the cytosines of P22 boxB right are tolerated because they are paired and stacked between the loop and the stem, as proposed for the homologous C:C base pair in phage 21 boxB (7). The complex thermodynamics of base stacking would be unlikely to yield a simple consensus.
What determines specificity between λ and P22 N-boxB interactions? Fluorescence studies support a model where any boxB bound by λ N adopts the 4-out conformation and any boxB bound by P22 N adopts a 3-out conformation (2). The thermodynamic costs of adopting λ- or P22-like loop and stem conformations likely dictate N binding, and specificity may be a simple consequence of disfavoring one conformation. Relaxed-specificity RNAs would be able to adopt either conformation, though at some cost to activity.
Neutral theories of evolution highlight the importance of genetic drift, asserting that mutational paths of neutral fitness exist between genotypes of distinct phenotypes (26,34,45). Computational studies of RNA secondary structure strongly support this assertion (21,38,45). However, experimental support for the application of neutral theories to biologically active RNAs has come from only a few examples (18,22,23,40). Our findings that single substitutions connect a P22 N-specific boxB via a moderately bifunctional boxB to a λ N-specific boxB support the extension of neutral theories from computationally predictable RNA secondary structures to biologically active RNA phenotypes. However, the existence of bifunctional RNAs linking the N-dependent boxBs of P22 and λ to those of phage 21 or the structurally unrelated and N-independent RNAs of the HK022 polymerase utilization site (3) seems less likely. Other evolutionary mechanisms exist, such as RNA-protein coevolution and recombination between lambdoid phages (25). Nonetheless, we find that neutral theories apply in this small model system, where conformation, rather than secondary structure as defined by base pairing, appears to define biological activity.
How common are RNAs that display multiple phenotypes? The intersection theorem of RNA secondary structure states that for any two secondary structures there is at least one sequence that is compatible with both structures (38). The phenotypes of biologically active RNA are typically more complex than the secondary structure, yet genetic drift along neutral paths may lead to new activities. Schultes and Bartel (40) found their intersection sequence able to adopt either of two distinct folds, albeit with severely reduced activity. Riboswitches are observed to undergo changes in secondary structure suggestive of intersection sequences, but few examples of RNAs that can bind different partners with no change in base pairing are known. NMR studies reveal that HIV RRE binds to its cognate peptide HIV Rev and to a selected peptide, RSG1.2, maintaining the same base pairing (50). A mutant hairpin RNA is able to function in either HIV or BIV transcriptional transactivation by binding homologous arginine-rich peptides in different recognition modes (41). These examples of RNAs able to bind multiple partners suggest that RNAs can serve as intersections of neutral paths between specific phenotypes. Relaxed specificity may require the ability to adopt distinct conformations. Though plasticity comes with lowered thermodynamic stability, evolution presents many examples of conformational flexibility. Induced-fit interactions are commonly observed in RNA-protein interactions and in arginine-rich peptide-RNA interactions in particular (35). Indeed, the arginine-rich domain of N protein is entirely disordered until it is bound to boxB (33), and the boxB loop becomes ordered upon peptide binding (4,24). The phenomenon of induced fit may increase the likelihood of sequences that can adopt more than one functional conformation, whether to signal binding, to interact with multiple partners, or to serve as intersection sequences.
RNAs able to bind multiple partners seem more likely to facilitate the neutral coevolution of protein partners.