Molecular analysis of a pathogenicity locus in Pseudomonas syringae pv. syringae

One of the chromosomal regions of Pseudomonas syringae pv. syringae encoding pathogenicity factors had been mapped into a 3.9-kilobase-pair fragment in previous studies. Promoter probe analysis indicated the existence of a promoter near one end of the fragment. DNA sequencing of this fragment revealed the existence of a consensus promoter sequence in the region of the promoter activity and two open reading frames (ORFs) downstream. These ORFs, ORF1 and ORF2, encoded putative polypeptides of 40 and 83 kilodaltons, respectively. All ORF1::Tn5 as well as ORF2::Tn5 mutant strains were nonpathogenic on susceptible host bean plants and were unable to elicit hypersensitive reactions on nonhost tobacco plants. The deduced amino acid sequence of the 83-kilodalton polypeptide contained features characteristic of known integral membrane proteins. Fusion of the lacZ gene to ORF2 led to the expression of a hybrid protein inducible in Escherichia coli. The functions of the putative proteins encoded by ORF1 and ORF2 are unknown at present.

Genetic studies of determinants of pathogenicity in phytopathogenic bacteria have been undertaken for a wide variety of organisms (23,32) including Pseudomonas (2,5,8,20,39,41). Isolation of mutants affected in their behavior on plants led to the identification of different types of genes involved in plant-pathogen interactions. The genes required for both the expression of disease symptoms on host plants and the development of the hypersensitive reaction on nonhost plants (16) have been designated hrp genes, whereas the name dsg has been attributed to genes responsible only for disease development (20). The hypersensitive reaction is considered a generalized expression of resistance by the plants to pathogens and is associated with only limited multiplication and spread of the pathogen surrounding the infected area (15). On the other hand, pathogenicity toward the susceptible host is considered to be the ability of the pathogen to establish itself in the plant, resulting in rapid multiplication and widespread invasion by the organism. Hence, characterization of these hrp genes should reveal important aspects of the plant-pathogen interactions.
The Pseudomonas syringae group of phytopathogenic bacteria contains various pathogens that cause diseases on the foliage of plants (12). Among these pathogens is an ecotype of Pseudomonas syringae pv. syringae, which is the causal agent of brown spot disease of Phaseolus vulgaris L., the common bean. Several regions of the bacterial genome that are involved in the pathogenicity of P. syringae pv. syringae R32 were identified in our laboratory by Tn5 mutagenesis (2). One of these mutants, PS9021, failed to incite disease symptoms on bean and to cause a hypersensitive reaction on nonhost plants such as tobacco. The mutant did not grow in planta (4) but grew on minimal agar medium, and, unlike the parental strain, which has firm and smooth colonies, it exhibited mucoidal colony morphology. Hence, the mutant appeared to be affected in one or more hrp genes. In this paper we describe the complete DNA sequence of a hrp locus affected in the mutant, transcriptional analysis of the pathogenicity region in the cosmid pOSU3105 (30), and construction of a fusion protein for raising antibodies against a pathogenicity determinant of this hrp locus.
Technical paper no. 8625 from the Oregon Agricultural Experiment Station.

MATERIALS AND METHODS
Bacterial strains and plasmids. Bacterial strains and plasmids used in this study are listed in Table 1.
DNA manipulations. Restriction enzymes, T4 DNA ligase, and exonuclease III were obtained from Bethesda Research Laboratories, Inc., and used as specified by the manufacturer. T4 DNA polymerase (New England BioLabs, Inc.) was used to fill in protruding 3' termini (19) [...] (42) in both orientations to allow sequencing of each strand of DNA. Sequential overlapping deletions of DNA in each clone were generated by the exonuclease III-mung bean nuclease method of Henikoff (13). Each of these new clones was sequenced by the dideoxynucleotide chain termination method (36). Problems arising from compressions in the gels were overcome by substituting 7-deazaguanosine-5'-triphosphate for dGTP (26). The DNA sequence and deduced amino acid sequences were analyzed by using computer programs described by Mount and Conrad (7,27).
Immunoblot analysis. Overnight cultures of the strains to be analyzed were diluted 100-fold in L broth with penicillin and grown to an A600 of 0.5 to 0.6. At this time, 2 mM IPTG was added to each culture, and the cultures were grown for an additional 1 h. A 1-ml sample of each culture was microcentrifuged for 1 min, and the pellet was suspended in 200 µl of the sample buffer (0.625 M Tris [pH 6.8], 2% sodium dodecyl sulfate, 10% glycerol, 2% 2-mercaptoethanol) and boiled for 3 min. A 10-µl sample of each lysate was subjected to sodium dodecyl sulfate-7% polyacrylamide gel electrophoresis as described by Laemmli (18). After separation, the protein bands were transferred to nitrocellulose filters (pore size, 0.2 µm; Schleicher & Schuell, Inc.) in a high-field electroblotting apparatus (Bio-Rad Laboratories) as previously described (9). Prestained molecular size standards (Bethesda Research Laboratories) were used to mark migrations of proteins on the blot. The transblotted filter was probed with a 1:10,000 dilution of mouse anti-β-galactosidase antibody (Promega Biotec, Inc.). Immunochemical staining of the bands was performed with an alkaline phosphatase-conjugated protoblot system (Promega Biotec) designed for use with mouse antiserum. At present, the hybrid protein is being purified in preparative amounts.

RESULTS
The pathogenicity locus affected in PS9021 is of interest for two principal reasons. First, the mutant is prototrophic and is able to utilize the same range of sugars and nitrogen sources as the wild type (data not shown), but is unable to grow in planta. Second, the mutant is altered in colony morphology, indicating that the gene(s) mutated may be involved in the synthesis of surface-associated or extracellular products, which, in turn, may have a role in the recognition essential for bacterial pathogenicity. Previous results (30) revealed a DNA sequence from a cosmid library of P. syringae pv. syringae R32 that complemented all of the altered phenotypes of PS9021. The approximate extent of this hrp locus was determined by site-directed mutagenesis with Tn5 in E. coli (24), followed by marker exchange of these mutations into the chromosome of PS9020, a strain isogenic to R32. Among the resulting strains was a group of six Hrp- mutants, PS3151 to PS3156 (Table 1; Fig. 1), that mapped within a 3.9-kilobase-pair (kb) HindIII fragment. Since the Tn5 insertions in PS9021 and PS3153 mapped to approximately the same site (24,30), this hrp locus was thought to be affected in the original mutant PS9021. Two proteins of approximately 37 and 85 kilodaltons (kDa) (25) (Fig. 1D) have been expressed from this locus in one direction in E. coli maxicells.
Detection of promoter activity in the locus. Various derivatives of the 3.9-kb HindIII fragment (Fig. 1B) encompassing the hrp locus were generated by using SalI and BglII sites present in the fragment. These smaller fragments were cloned in both orientations in the appropriate polylinker sites upstream of a promoterless galactokinase gene (galK) in the promoter probe plasmids pKO4 and pKO6 (22). These clones were transformed into the indicator strain, E. coli N100, and tested on MacConkey plates with galactose as the sole carbon source. The clones with promoter activity formed red colonies. Promoter activity was detected in the HindIII-SalI and HindIII-BglII (Fig. 1B) fragments from the right end of the locus, and the direction of transcription was inward (Fig. 1C). To verify that the promoter activity detected in E. coli is truly representative of that in P. syringae pv. syringae, the two HindIII-BglII fragments and derivatives of these fragments were individually cloned into a polylinker site upstream of a promoterless chloramphenicol transacetylase gene in the broad-host-range plasmid pIJ3100 (31). The clones obtained were transformed (28) into P. syringae pv. syringae R32 to allow selection for the Smr marker present on pIJ3100, and individual colonies were then tested for the chloramphenicol resistance phenotype to detect any promoter activity present in the cloned fragment. The clones that carried a functional promoter in E. coli also conferred resistance to 15 to 20 µg of chloramphenicol per ml in P. syringae pv. syringae. These results suggest that promoter activity, functional in both E. coli and P. syringae pv. syringae, resides near the right end of the 3.9-kb fragment.
DNA sequence analysis. The plasmid pOSU3126 (Table 1), used in maxicell studies, contains the 3.9-kb HindIII fragment (Fig. 1) encompassing the hrp locus cloned into a polylinker site. An internal BglII site is located 2.1 and 1.8 kb away, respectively, from the HindIII sites at the right and left ends of the 3.9-kb fragment (Fig. 1B). A DNA fragment in pOSU3126 that extends from the BamHI site in the polylinker to the internal BglII site 2.1 kb downstream was subcloned in both orientations at the BamHI site of M13mp18. Similarly, a 1.8-kb BamHI-BglII fragment that extends from the BamHI site in the polylinker through the rest of the 3.9-kb fragment was subcloned from pOSU3125 (Table 1) into M13mp18 in both orientations. A 2.7-kb SalI fragment (Fig. 1B) within the 3.9-kb fragment was also cloned in both orientations in M13mp18 to verify the DNA sequences around the BglII site that was used as an endpoint in previous clones. Unidirectional deletions of the 2.1-, 1.8-, and 2.7-kb fragments were generated from either end of each fragment by the exonuclease III deletion method (see Materials and Methods). Sequential overlapping clones with endpoints separated by ca. 150 base pairs (bp) were used to perform the DNA sequencing of each strand of the fragments. The DNA sequence of the strand containing the promoter activity (sense strand) and the deduced amino acid sequences are shown in Fig. 2.
The sense strand contains two open reading frames (ORFs), ORF1 and ORF2, that code for putative polypeptides of 40 and 83 kDa, respectively (Fig. 1E). These data are in good agreement with the sizes of the polypeptides (ca. 37 and ca. 85 kDa) expressed in E. coli maxicells in earlier studies (25). An E. coli consensus promoter sequence (34) is present upstream of both ORFs (Fig. 2), where promoter activity was detected. However, the -10 region of this promoter overlaps with the first translational initiation codon, ATG, of ORF1, and no promoter activity was detected in the 0.7-kb HindIII fragment present immediately upstream. Initiation of translation at the following in-frame ATG of ORF1 present at nucleotide 405 (Fig. 2) will yield a putative 28-kDa polypeptide (see Discussion). The translational initiation codon ATG of ORF2 is located 205 nucleotides downstream of the translational stop codon of ORF1. This ATG is flanked by a 9-bp inverted repeat, and a consensus ribosome-binding site (RBS), GGAGGA (37), is located immediately upstream (Fig. 2). Two DNA sequences identical to 7 and 6 bp of the left repeat sequence are located further upstream between ORF1 and ORF2. A segment of the DNA sequence located 175 nucleotides downstream of the translational termination codon of ORF2 contains all the features of a transcriptional terminator. Specifically, there is an 11-bp inverted repeat that can form a putative stem by base pairing of the mRNA, leaving a 5-base loop with six T nucleotides immediately downstream of the stem-and-loop structure. Hence, the DNA sequence of the 3.9-kb fragment contains the features of a polycistronic operon.
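The terminator features described above (an 11-bp inverted repeat forming a stem, a 5-base loop, and six T residues immediately downstream) lend themselves to a simple sequence scan. The following is a minimal sketch only, not the procedure used in this study: the function names are hypothetical, the scan requires a perfect inverted repeat (real terminators may tolerate mismatches), and the test sequence is a made-up toy string rather than the sequence in Fig. 2.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_terminator_candidates(seq, stem_len=11, loop_len=5, min_t_run=6):
    """Return (start, stem, loop) tuples for stem-loop + T-run candidates.

    A candidate is a window in which the first `stem_len` bases are the
    reverse complement of the `stem_len` bases that follow a loop of
    `loop_len` bases, and the window is immediately followed by at least
    `min_t_run` T residues. Defaults mirror the numbers quoted in the text;
    the perfect-match requirement is a simplifying assumption.
    """
    seq = seq.upper()
    span = 2 * stem_len + loop_len
    hits = []
    for start in range(len(seq) - span - min_t_run + 1):
        left = seq[start:start + stem_len]
        loop = seq[start + stem_len:start + stem_len + loop_len]
        right = seq[start + stem_len + loop_len:start + span]
        tail = seq[start + span:start + span + min_t_run]
        if right == reverse_complement(left) and tail == "T" * min_t_run:
            hits.append((start, left, loop))
    return hits


# Toy example (invented sequence, for illustration only).
stem = "GCCGGATCGCA"
toy = "AAAC" + stem + "TTAAC" + reverse_complement(stem) + "TTTTTT" + "GA"
candidates = find_terminator_candidates(toy)
```

A scan of this kind only flags candidates; whether a flagged site actually terminates transcription still has to be tested experimentally.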
FIG. 2. [...] of ORF1. The putative translational initiation site for each ORF is indicated by a solid arrow above the sequence, whereas both the start and stop sites are shown in capital letters in the amino acid sequences. A consensus sequence for a putative RBS upstream of ORF2 is indicated by smaller arrows above the sequence. Conserved regions of a consensus promoter sequence upstream of ORF1 are underlined. A 9-bp inverted repeat sequence flanking the putative initiation codon of ORF2 is indicated by arrows pointing inward. Part of this repeat (GCCGAG), which is reiterated in direct order upstream of ORF2, is identified by a shorter arrow below the sequence. Dotted lines below the sequence downstream of ORF2 indicate the stem of a putative stem-and-loop structure that may act as a transcriptional terminator. Vertical arrows above the sequence indicate the extent of ORF2 used in protein fusion experiments.
Features of the putative polypeptides encoded by the ORFs. To determine whether the polypeptides encoded by the ORFs share any features with proteins of known function, we performed a computer search for amino acid sequence homology with proteins in the GenBank and European Molecular Biology Laboratory data bases. The amino acid sequence deduced from ORF2 did not show homology with any known protein. To further investigate any feature of the 83-kDa protein shared with known proteins, we plotted the distribution of its hydrophilic and hydrophobic amino acids by using several different computer programs (17). The hydrophobicity plot, which depicts membrane-spanning helices of a protein (10,11), revealed an interesting feature of this protein. The hydrophobicity plots of the 83-kDa polypeptide and a transmembrane protein, rhodopsin (29), are presented in Fig. 3. The existence of nonpolar transbilayer helices in the amino acid sequence of rhodopsin is shown by hydrophobic peaks (Fig. 3A). The presence of similar membrane-spanning helices in the plot of the 83-kDa polypeptide suggested that it might be a transmembrane protein (Fig. 3B). This plot also indicated that the most hydrophilic region, consisting of six arginine residues [...]
An intriguing feature of this hrp locus is the presence of two ORFs on the antisense strand that encode putative polypeptides of 81 and >39 kDa (Fig. 4B). The second ORF on this strand extends beyond the HindIII site at the right end of the hrp locus, and therefore its putative size was not determined. These ORFs consist mostly of codons complementary to those present in the respective ORFs in the sense strand. The genetic data suggested that the putative products of these ORFs, if expressed, were not involved in pathogenesis. Tn5 insertions designated 3155 and 3156 (Fig. 4A), which resulted in loss of pathogenicity, should not affect expression of the putative 81-kDa protein, and insertion 3157 in the following ORF (>39 kDa) did not affect pathogenicity. Moreover, expression of any polypeptide from these ORFs has not been established in maxicell studies (25); no detectable promoter activity is present upstream; and examination of codon usage in E. coli (1) reveals that products encoded by the antisense strand, unlike those encoded by the sense strand, will be poorly expressed (data not shown). Hence, the genes encoding the 83-kDa polypeptide, designated hrpM, and possibly the 40-kDa polypeptide appeared to be the major determinants of this pathogenicity locus.
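The hydrophobicity analysis of the 83-kDa polypeptide described earlier in this section can be illustrated with a sliding-window hydropathy profile. The sketch below is an assumption-laden illustration, not the programs actually used (17): it uses the Kyte-Doolittle hydropathy scale, a 19-residue window, and a mean-hydropathy cutoff of 1.6, which are commonly used choices for flagging candidate membrane-spanning segments. The function names and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every full window along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def candidate_membrane_spans(profile, threshold=1.6):
    """Return window start indices whose mean hydropathy exceeds threshold."""
    return [i for i, value in enumerate(profile) if value > threshold]


# Toy peptide (invented): an arginine-rich stretch, a leucine run, then aspartates.
toy_profile = hydropathy_profile("RRRRRR" + "L" * 20 + "DDDDD")
toy_spans = candidate_membrane_spans(toy_profile)
```

On this scale arginine has the lowest value (-4.5), so a cluster of arginine residues would be expected to appear as a pronounced hydrophilic trough in such a plot.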
Construction of a chimeric gene from the hrpM locus. Since the original mutant PS9021 had Tn5 inserted in hrpM, we attempted to overexpress the gene in E. coli by cloning the entire locus downstream of the temperature-inducible λ PL promoter and the IPTG-inducible tac promoter in expression vectors pCP3 (33) and pKK223-3 (Pharmacia, Inc.), respectively. No overexpression was observed in Coomassie blue-stained gels under induced conditions. To circumvent any problem associated with the expression of the wild-type 83-kDa protein in E. coli, we constructed a chimeric gene by fusing the SacII-BalI fragment (Fig. 4D) of the hrpM gene to codon 8 of the lacZ gene in the appropriate reading frame.
Initially, a pBR322 derivative plasmid, pOSU3125, containing the 3.9-kb HindIII fragment (Fig. 4A) was digested with SacII, and the ends were filled in with T4 DNA polymerase. Synthetic linkers carrying the translational initiation codon ATG as part of an NcoI site (Pharmacia, Inc.) were then ligated to the blunt-ended SacII site to introduce translational start codons. These linkers, of 8, 10, and 12 bp, created ATG start sites in the three possible reading frames at the SacII site of the hrpM gene. The NcoI-HindIII digests of these clones were then ligated to plasmid pKK233-2 (Pharmacia), which was digested similarly. The resulting clones were screened for inserts that contain the BalI site of the 3.9-kb fragment, and clones which contained the 8-, 10-, and 12-bp linkers were designated pOSU4101, pOSU4102, and pOSU4103, respectively. The NcoI site in plasmid pKK233-2 is situated at an optimum distance downstream of a consensus RBS, as well as the inducible tac promoter of E. coli (Fig. 5). Hence, the three clones contained translational start sites that should express truncated proteins upon induction from the three respective reading frames of the hrpM gene. However, if fusions were performed properly and the DNA sequencing data were correct, the ATG in the 12-bp linker was the only initiation codon predicted to be properly aligned with hrpM' and lacZ'. An EcoRI site upstream of the tac promoter in pKK233-2 and a BalI site near the 3' end of hrpM (Fig. 5) were used to release an EcoRI-BalI fragment from each of the three clones that contained all the features of the 5' end of a gene, namely, the promoter, the RBS, and the truncated ORF beginning with an ATG codon. Each of these fragments was separately ligated to the EcoRI-SmaI digest of pMLB1034 (Table 1; Fig. 5), and the resulting plasmids were designated pOSU4104, pOSU4105, and pOSU4106 (Fig. 5). The ligation of BalI and SmaI ends created a fusion of the 3' end of the truncated hrpM gene to codon 8 of the lacZ gene present in the plasmid pMLB1034.
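The reason linkers of 8, 10, and 12 bp place an ATG in each of the three possible reading frames is simple modular arithmetic: each length leaves a different remainder on division by 3. The short sketch below illustrates only that arithmetic. The function names are hypothetical, and the sketch does not attempt to say which remainder corresponds to the correct hrpM frame, since that depends on the position of the ATG within each linker and on the filled-in SacII junction, neither of which is spelled out here.

```python
def frame_offset(linker_length_bp):
    """Return the reading-frame offset (0, 1, or 2) implied by a linker length."""
    return linker_length_bp % 3


def frames_covered(linker_lengths_bp):
    """Map each linker length to its frame offset."""
    return {length: frame_offset(length) for length in linker_lengths_bp}


# Linker lengths quoted in the text.
offsets = frames_covered([8, 10, 12])
all_three_frames = set(offsets.values()) == {0, 1, 2}
```

Because the three offsets are distinct, exactly one of the three constructs can be in frame with both hrpM' and lacZ', which is why detection of the fusion protein from only one clone also serves as a check on the sequence.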
Expression of the fusion protein encoded by the hrpM'-lacZ' hybrid gene in E. coli. Plasmids pOSU4104, pOSU4105, and pOSU4106 were transformed into E. coli JM105. Since JM105 contains an F' episome containing a repressor for the tac promoter, induction of the tac promoter in this strain can be achieved by addition of IPTG. Strains JM105(pOSU4104), JM105(pOSU4105), and JM105(pOSU4106) were grown to mid-log phase and induced with 2 mM IPTG for 1 h. Equal amounts of cell lysates (see Materials and Methods) from all these strains were subjected to Western immunoblot analysis with antibody raised against β-galactosidase. A putative polypeptide of 163 kDa was predicted to be encoded by the hybrid gene, and a protein of comparable size was detected only in the lysates of JM105(pOSU4106) (Fig. 6). Since the lacZ' gene was fused to the appropriate reading frame of hrpM' in pOSU4106, expression of the fusion protein in JM105(pOSU4106) lysates provided further verification of the DNA sequencing data. Expression of a truncated β-galactosidase protein (with 32 amino acids deleted) from the lacZΔM15 gene (19) present on the F' episome in JM105 provided a convenient marker protein of 112 kDa, which was detected in each of the lysates in Western blot analysis. The amount of hybrid protein obtained in this induction assay was estimated to be 0.25% of the total protein present in the cell lysates (data not shown). At present, the fusion protein is being purified in preparative amounts for use in raising antibodies against the 83-kDa polypeptide.

DISCUSSION
A pathogenicity locus of P. syringae pv. syringae PS9020 was shown to be contained within a 3.9-kb HindIII fragment. When the DNA sequence of this region was determined and analyzed, two ORFs appeared to be important for the Hrp+ phenotype of PS9020 on bean plants. The Tn5 insertions in ORF2 resulted in a loss of pathogenicity, indicating that the 83-kDa protein is a pathogenicity factor. Although the Tn5 insertion designated 3157 appeared to be within the putative promoter sequence on the basis of a fine-structure restriction map of the Tn5 insertions, this insertion did not affect the Hrp+ phenotype of PS9020. The reason for such a phenotype is not clear at present, but it is possible that a promoter in Tn5 is being used to express the downstream ORFs under these conditions (3). Although Tn5 insertions were not obtained downstream of hrpM to reveal the extent of this pathogenicity locus in that direction, a DNA sequence 88 bp downstream of hrpM was characteristic of a transcription termination site, suggesting that hrpM was the last gene in any possible operon structure. The common features of bacterial termination sequences (35) include an inverted repeat, a G+C-rich sequence of 3 to 11 contiguous bases, and a run of 4 to 8 U residues following the G+C-rich region of the mRNA. A DNA sequence containing all these features was detected 175 bp downstream of the hrpM gene (Fig. 2). Moreover, in maxicell studies, expression of the galK gene was severely reduced when an insert containing this sequence was cloned upstream in the proper orientation (25). Thus, any other gene downstream of hrpM would probably belong to a separate genetic unit. Another interesting feature of the 83-kDa polypeptide is a 9-bp inverted repeat that flanks the putative ATG start codon. The left repeat separated the putative RBS from ATG (Fig. 2), suggesting that regulation of hrpM expression may occur at the translational level (43).
The role of the ORF1-encoded product in pathogenicity is as yet undetermined. The Tn5 insertions in ORF1 led to mutants with the same phenotypes (nonpathogenic and mucoid) as those resulting from insertions within the hrpM gene downstream. These results suggested that the expression of the hrpM product was inhibited by Tn5 insertions in the upstream ORF. Such inhibition might be the result of the polar effect of Tn5 insertions in distal genes (3), since no separate promoter was detected upstream of hrpM. Alternatively, these Tn5 insertions could have inactivated the ORF1 product, which might act as a positive regulator of hrpM gene expression. The first possibility does not necessarily involve the ORF1 product in pathogenicity, whereas the second one implicates ORF1 as a regulatory gene of the hrp locus. A third possibility is that ORF1 is an independent hrp locus. Interestingly, computer analysis of amino acid homology with known proteins in the data bases revealed that the putative 40-kDa protein encoded by ORF1 exhibited 30 to 35% homology over a range of 80 to 130 amino acids with various DNA-binding proteins (data not shown), such as the regulatory histones H1 and H5 (6). Since proteins involved in the regulation of gene expression are DNA-binding proteins in many cases, it is possible that the ORF1-encoded product is involved in regulation of hrpM or other hrp loci.
Some ambiguity exists in determining the size of the wild-type protein encoded by ORF1. A protein estimated to be 37 kDa was detected in maxicell studies when an external promoter from the lac operon of E. coli was present upstream (25). In such a case, translation could begin at the first ATG codon of ORF1. However, if the consensus promoter sequence detected upstream of ORF1 is the promoter transcribing both ORF1 and hrpM in the chromosome of P. syringae pv. syringae R32, then the first ATG of ORF1, at nucleotide 28 (Fig. 2), is precluded as a translational start site, since it is part of the -10 region of this promoter. The next ATG codon, located 372 bases downstream, could initiate synthesis of a 27-kDa protein. A protein of that size could have gone undetected in E. coli maxicells, since it would be obscured in the sodium dodecyl sulfate-polyacrylamide gel by a comigrating bla gene product (40). Purification of the ORF1-encoded product and physical mapping of the transcriptional unit will reveal the size of the ORF1 product and the location of the native promoter for this hrp locus.
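The size estimate for an ORF1 product initiated at the second ATG can be checked with rough arithmetic. The sketch below assumes an average residue mass of about 110 Da, a widely used rule of thumb that is not taken from this paper; the function names are hypothetical. Skipping 372 bases removes 124 codons, or roughly 13.6 kDa, from the 40-kDa full-length prediction.

```python
# Rule-of-thumb average mass per amino acid residue (an assumption, not from the paper).
AVG_RESIDUE_MASS_DA = 110.0


def mass_removed_kda(skipped_nt):
    """Approximate mass (kDa) of the N-terminal segment skipped by a later in-frame start."""
    if skipped_nt % 3 != 0:
        raise ValueError("an in-frame start must be a multiple of 3 nt downstream")
    codons = skipped_nt // 3
    return codons * AVG_RESIDUE_MASS_DA / 1000.0


def truncated_mass_kda(full_length_kda, skipped_nt):
    """Approximate mass (kDa) of the product initiated at the downstream start codon."""
    return full_length_kda - mass_removed_kda(skipped_nt)


# Values quoted in the text: a 40-kDa ORF1 product and a second ATG 372 bases downstream.
estimate = truncated_mass_kda(40.0, 372)
```

Under these assumptions the estimate comes out near 26 kDa, in reasonable agreement with the 27-kDa figure given above; the small difference is within what a fixed average residue mass can be expected to resolve.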