Journal of Bacteriology, May 2004, p. 2818-2828, Vol. 186, No. 9
0021-9193/04/$08.00+0 DOI: 10.1128/JB.186.9.2818-2828.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Division of Infectious Diseases, Veterans Affairs Greater Los Angeles Healthcare System, Los Angeles, California 90073 (1); Department of Medicine (2); Department of Biomathematics, David Geffen School of Medicine at the University of California-Los Angeles, Los Angeles, California 90095 (3); National Animal Disease Center, Agricultural Research Service, U.S. Department of Agriculture, Ames, Iowa 50010 (4)
Received 26 November 2003/ Accepted 28 January 2004
Several leptospiral OMPs have been described, differing in their degrees of surface exposure. OmpL1 is a porin that has been shown to be surface exposed by immunoelectron microscopy and surface immunoprecipitation (39). Like porins of gram-negative bacteria, OmpL1 is an oligomer and its electrophoretic migration in acrylamide gels is heat modifiable (39). Based upon these findings and beta-moment analysis of the OmpL1 amino acid sequence, a topological model of OmpL1 was proposed with five surface-exposed loops and 10 beta-sheet transmembrane segments (13). In addition to OmpL1, a number of lipoprotein OMPs have been characterized, including LipL21 (6), LipL32 (14), LipL36 (16), and LipL41 (40), designated according to their apparent molecular weights determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. LipL21 and LipL41 appear to be surface exposed, whereas LipL36 is thought to be restricted to the inner leaflet of the outer membrane (6, 40). OmpL1 and LipL41 exhibit synergistic immunoprotection in the hamster model of leptospirosis (17). LipL32 is the leptospiral major OMP and has an immunoprotective effect when hamsters are immunized with LipL32 incorporated into an adenovirus construct (4). The degree to which LipL32 is surface exposed has not been determined. LipL32, LipL41, and OmpL1 are major antigens in the humoral immune response to leptospirosis (10, 12). Recently, a new family of leptospiral lipoproteins that contain repeated bacterial immunoglobulin-like domains was described (28). Expression of two of these Lig (leptospiral immunoglobulin-like) proteins, LigA and LigB, is associated with virulence. The ligC gene, encoding a third member of the family, appears to be a pseudogene in Leptospira interrogans and Leptospira kirschneri. A third class of leptospiral OMPs comprises the LipL45-derived peripheral membrane proteins, which are associated with, but not integrated into, the outer membrane (29, 31).
Comparative analysis of OMP gene sequences can reveal insights into novel mechanisms of molecular evolution in pathogenic bacteria. Previous studies have revealed that the organization of leptospiral genomes is relatively fluid due to various types of recombination events. Comparative pulsed-field gel electrophoresis studies of even closely related strains reveal many large rearrangements (3, 52). In addition, Leptospira species contain repetitive transposase-encoding insertion sequences (IS), some of which may play a role in producing genomic rearrangements (3, 24). The mobility of one of these IS elements, IS1500, is sufficient to allow Southern blot discrimination of 15 different groups within isolates of L. interrogans serovar Pomona type kennewicki (51). Evidence for horizontal transfer of DNA among Leptospira species comes from studies of the intervening sequences found within the 23S rRNA gene (37) and from the finding that the leptospiral lipopolysaccharide biosynthetic locus (rfb) is located in a genomic island that was probably acquired through horizontal transfer from a gram-negative source (24). Comparative sequence analysis of rfb loci has provided information about variability of genes involved in lipopolysaccharide biosynthesis (7). However, relatively little is known about the mechanisms and extent of molecular diversity of leptospiral OMP genes. Given the evidence that OmpL1, LipL41, and LipL32 play a role in protective immunity, the degree of amino acid sequence variation of these OMPs was determined in a large number of strains. This information is useful in making predictions about how broadly protective these immunogens would be and in understanding the frequency of recombination events involving OMP genes.
(Portions of this work were presented at the 102nd Annual Meeting of the American Society for Microbiology, Washington, D.C., 21 to 25 May 2002.)
TABLE 1. Strains of Leptospira species used in DNA sequence analysis studies
TABLE 2. PCR and sequencing primers
Possible intragenic recombination in ompL1 was examined by using a novel extension to the Bayesian multiple change point (BMCP) model proposed by Suchard et al. (43, 44). This model assumes that the sites along a sequence alignment separate into an unknown number of contiguous segments, each with possibly different evolutionary relationships between the organisms, evolutionary rates, and transition-transversion ratios. Differing evolutionary relationships on either side of a breakpoint suggests recombination (26). One strength of the BMCP model is that it can measure uncertainty in breakpoint locations, determine the most likely parental sequences of the putative recombinant, and assess the statistical significance of recombination simultaneously; this simultaneous approach avoids a statistical pitfall inherent in sequential testing for recombination found in many recombination detection programs (44).
Current implementations of the BMCP model limit analysis to five sequences (33) or assume a known and fixed phylogenetic tree relating a nearly unlimited number of representative parental sequences (44). In this study, these assumptions were relaxed to allow for an a priori unknown relationship between any number of sequences (n). Among n sequences, there exist $E = (2n-5)(2n-7)\cdots(3)(1)$ possible unrooted trees. The BMCP model was adapted in two steps. In step one, the original (L) alignment sites for a gene from the n sequences were divided into a large number (M) of overlapping contiguous sections, each of length l. The MrBayes program was then employed to estimate the five most probable trees for each section, and a list of these trees across all sections was created. Assume this process resulted in a list of P unique trees. In step two, the tree at each site was allowed to equal any one of the P possible trees. Although the total number of allowable tree configurations ($P^L$) is, in general, considerably smaller than $E^L$, the number of total possible configurations in an unrestricted state-space, little bias is expected. Assuming the initial list of P trees is overly diffuse compared to the true posterior distribution over all possible trees in each inferred segment, results from the extended BMCP should well approximate the true posterior distribution.
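The size of the unrestricted tree space can be made concrete with a short calculation. The following is a minimal sketch, not the authors' software: it assumes unrooted bifurcating trees (the trees in this study are reported as unrooted) and compares, on a log scale, the restricted configuration count (P to the power L) with the unrestricted count (E to the power L). The function names are hypothetical.

```python
import math


def count_unrooted_trees(n: int) -> int:
    """Number of unrooted bifurcating trees on n labeled sequences.

    This is the double factorial (2n-5)!! = (2n-5)(2n-7)...(3)(1).
    """
    if n < 3:
        raise ValueError("at least 3 sequences are needed for an unrooted tree")
    total = 1
    for k in range(2 * n - 5, 0, -2):
        total *= k
    return total


def log10_configurations(trees_per_site: int, n_sites: int) -> float:
    """log10 of trees_per_site ** n_sites, computed without huge integers."""
    return n_sites * math.log10(trees_per_site)


if __name__ == "__main__":
    n, n_sites, p = 10, 795, 50  # values reported later in the text
    e = count_unrooted_trees(n)
    print(f"E for n={n}: {e}")
    print(f"log10(P^L) = {log10_configurations(p, n_sites):.1f}")
    print(f"log10(E^L) = {log10_configurations(e, n_sites):.1f}")
```

Working on a log scale avoids constructing integers with thousands of digits while still showing how much smaller the restricted state space is.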
This extended BMCP model was employed to identify a collection of ompL1 sequences with significant support against recombination and in favor of a single tree describing their evolutionary relationships along the entire gene. Sequences were included in this collection from as many genomospecies as possible. These nonrecombinant sequences with their backbone tree then served as possible parental representatives for the analyses of the putative ompL1 recombinant sequences. For each putative recombinant, breakpoints and parental representatives were inferred and their support in favor of recombination was assessed by using the approach of Suchard et al. (43).
Amino acid sites in ompL1 subject to positive selection were identified by codon substitution models (11) for estimating synonymous and nonsynonymous rates in PAML (48, 49). In particular, models were employed in which the ratio of nonsynonymous to synonymous changes (ω) may vary from codon to codon. Statistical significance of positively selected sites was assessed by an empirical Bayes approach under model M8 (48). To control for possible intragenic recombination when performing this analysis, the gene alignment was first partitioned into independent segments based on the most probable breakpoint locations. For each segment, the tree was set equal to that most probable for the segment and tests for positive selection proceeded assuming independence between segments. Results are compared to those obtained without controlling for recombination.
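For orientation, the ratio of nonsynonymous to synonymous changes is conventionally written as ω and interpreted as below. The symbols d_N and d_S are standard notation rather than symbols taken from this text, and the block is a reminder of the usual reading of the ratio, not a restatement of the specific PAML model settings used here.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying (negative) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive (diversifying) selection}
\end{cases}
```

Under a site model, a codon is reported as positively selected when its posterior probability of belonging to a class with ω > 1 is high.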
Nucleotide sequence accession number. Previously unpublished gene sequences were submitted to GenBank and assigned accession numbers AY461855 to AY462006.
FIG. 1. Comparison of DNA and amino acid (AA) sequence nonidentity among leptospiral genes. Sequences of the 16S, lipL32, lipL41, and ompL1 genes from 38 leptospiral strains were compared, revealing striking differences in the degree of DNA and amino acid sequence variability between genes. The 16S and lipL32 genes were less variable, and most lipL32 and lipL41 nucleotide mutations were synonymous. In contrast, significantly greater DNA and amino acid sequence variability was observed for the lipL41 and ompL1 genes. Differences in DNA and amino acid sequence variability were significant for all genes (two-tailed t test for paired samples, P < 0.0001). Error bars indicate standard deviations.
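The nonidentity comparison summarized in this figure can be illustrated with a small calculation. This is a hedged sketch only: the text does not state how nonidentity was computed, so the treatment of gaps (columns with a gap in either sequence are skipped) and the toy sequences are assumptions, and the function names are hypothetical.

```python
from itertools import combinations


def percent_nonidentity(a: str, b: str) -> float:
    """Percentage of aligned, ungapped positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: positions with a gap ('-') in either sequence are ignored.
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    mismatches = sum(x != y for x, y in compared)
    return 100.0 * mismatches / len(compared)


def mean_pairwise_nonidentity(seqs: list[str]) -> float:
    """Mean percent nonidentity over all unordered pairs of sequences."""
    values = [percent_nonidentity(a, b) for a, b in combinations(seqs, 2)]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy alignment: AAA and AAG both encode lysine, so the DNA differs
    # while the translated protein does not (a synonymous change).
    dna = ["ATGAAA", "ATGAAG", "ATGAAG"]
    protein = ["MK", "MK", "MK"]
    print(mean_pairwise_nonidentity(dna))
    print(mean_pairwise_nonidentity(protein))
```

The toy input shows why DNA nonidentity can exceed amino acid nonidentity when most substitutions are synonymous.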
FIG. 2. Localization of OmpL1 variable regions. Comparison of the OmpL1 amino acid sequences of 38 leptospiral strains reveals four variable regions corresponding to the four largest surface-exposed loops. (A) Histogram of OmpL1 amino acid sequence variability. Variable amino acids are clustered in four variable regions corresponding to the four largest surface-exposed loops of OmpL1. Variable region 1 (VR1) is located in the first surface-exposed loop (SEL1) and contains the highest number of variable amino acids. Variable region 2 (VR2) is located in the second surface-exposed loop (SEL2), variable region 3 (VR3) is located in the fourth surface-exposed loop (SEL4), and variable region 4 (VR4) is located in the fifth surface-exposed loop (SEL5). The height of each bar in the histogram indicates the degree of amino acid sequence variation from the consensus amino acid residue at that location. The OmpL1 amino acid sequence is numbered from the N-terminal amino acid of the mature protein. (B) Alignments of OmpL1 variable regions. Alignments of the four OmpL1 variable regions are shown by using the consensus amino acid sequences for the horizontally transferred peregrine allele (trans) and each of the six Leptospira species examined in this study: L. interrogans (inter), L. kirschneri (kirsch), L. noguchii (noguc), L. borgpetersenii (borgp), L. santarosai (santa), and L. weilii (weili). Locations of variable amino acids are indicated by dark boxes. The OmpL1 amino acid sequence is numbered from the N-terminal amino acid of the mature protein.
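A per-position variability profile of the kind plotted in panel A can be sketched as follows. This is an illustration under assumptions: variability is taken here to be the fraction of sequences that differ from the most common residue in each alignment column, which may not match the exact measure used for the figure, and the toy alignment is invented.

```python
from collections import Counter


def variability_profile(aligned: list[str]) -> list[tuple[str, float]]:
    """For each alignment column, return (consensus residue, fraction deviating)."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(aligned)
    profile = []
    for column in zip(*aligned):
        # The consensus is the most frequent residue in the column.
        consensus, count = Counter(column).most_common(1)[0]
        profile.append((consensus, 1 - count / n))
    return profile


if __name__ == "__main__":
    toy = ["MKTA", "MKSA", "MRTA", "MKTA"]  # hypothetical sequences
    for position, (residue, deviation) in enumerate(variability_profile(toy), start=1):
        print(position, residue, deviation)
```

Runs of adjacent positions with nonzero deviation would correspond to variable regions in such a profile.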
FIG. 3. Phylogenetic trees for the 16S, lipL32, lipL41, and ompL1 genes. These unrooted phylogenetic trees summarize the posterior distribution of the evolutionary relationship among leptospiral strains inferred for the 16S, lipL32, lipL41, and ompL1 genes with MrBayes. Sequences from the same 38 strains were used in each of the four trees. Each strain is represented in each gene tree by a lowercase letter (i, L. interrogans; k, L. kirschneri; n, L. noguchii; s, L. santarosai; b, L. borgpetersenii; w, L. weilii) followed by the 4-digit strain code (detailed descriptions of strains used in this study are given in Table 1). Sequences suggestive of horizontal transfer of DNA (bHB10* in the lipL41 tree and the lowest branch of the ompL1 tree) are indicated by asterisks. Numbers provided above major branches are branch lengths and are reported in the expected number of nucleotide substitutions per site.
The inferred lipL41 tree (Fig. 3C) is distinct from those of the 16S and lipL32 sequence trees in two regards. First, in the lipL41 tree, all six species form monophyletic groups, consistent with the greater degree of sequence variation for the lipL41 sequences. Second, the lipL41 sequence of L. borgpetersenii strain HB10 is no longer grouped with L. borgpetersenii (posterior probability, >0.99) and is instead most likely grouped with L. interrogans (posterior probability, >0.92). This suggests that horizontal transfer of the strain HB10 lipL41 sequence may have occurred. The entire lipL41 sequence of L. borgpetersenii strain HB10 is characteristic of L. interrogans lipL41 sequences, indicating that the entire gene was acquired during the putative transformation event.
The inferred ompL1 tree (Fig. 3D) also differs strikingly from the 16S, lipL32, and lipL41 trees in that a major new branch is formed, involving 21% (8 of 38) of the sequences (posterior probability, >0.99). This new branch is more closely related to the L. interrogans, L. kirschneri, and L. noguchii cluster than to the L. santarosai, L. borgpetersenii, and L. weilii cluster. Of the eight ompL1 sequences included in this study that are found in the new branch of the ompL1 tree, five were from L. interrogans strains. The three other sequences found in the new branch of the ompL1 tree were from members of the L. kirschneri, L. noguchii, and L. borgpetersenii species. Separate sequencing studies have also identified an L. weilii strain containing a similar ompL1 gene sequence (data not shown), indicating the widespread distribution of DNA from the new branch of the ompL1 tree.
Given the unexpected finding of a possibly new major lineage represented in only one of the gene trees, the ompL1 sequence alignment was examined for an explanation. The individual site patterns from the different variable regions of the gene were compared. By eye, it was observed that sites from different regions tended to support different phylogenetic trees relating the 8 sequences in the new branch to the remaining major clades. Such spatial phylogenetic variation is suggestive of recombination (42). However, it remained unknown whether recombination between parentals derived from the typed clades resulted in progeny that were artifactually placed along a new major branch when assuming a single tree for the whole gene or whether recombination had occurred between the typed clades and an allele derived from a new lineage. A BMCP model was used to formally test the hypothesis of recombination without assuming known parentals.
Multiple change point model.
BMCP models were used to examine the evidence that recombination produced the sequences found in the new branch of the ompL1 gene tree. To accomplish this task, 10 representative ompL1 sequences with strong support for being nonrecombinant were identified. These representative sequences consisted of two sequences each from L. interrogans, L. kirschneri, L. noguchii, L. santarosai, and L. borgpetersenii. Dividing the L (795) original alignment sites for the 10 sequences into M (approximately 80) overlapping segments, each of length l (50), produced a list of P (50) unique trees. Assuming a prior probability of recombination of 0.5, it was found that, with >0.99 posterior probability, no recombination had occurred between the representative sequences, yielding a single consistent tree along the entire gene alignment.
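The division of the alignment into overlapping sections can be sketched as a sliding window. This is a minimal illustration, not the procedure actually used: the step between window starts is not given in the text, so the step of 10 sites below is an assumption, chosen because it gives a window count of the same order as the reported M. The function name is hypothetical.

```python
def overlapping_windows(n_sites: int, window: int, step: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) windows covering every site of the alignment.

    Windows start every `step` sites; a final window is added if the regular
    grid would leave the last few sites uncovered.
    """
    if window > n_sites or window <= 0 or step <= 0:
        raise ValueError("need 0 < window <= n_sites and step > 0")
    windows = [(s, s + window) for s in range(0, n_sites - window + 1, step)]
    if windows[-1][1] < n_sites:
        # Shift a last window back so that it ends exactly at the final site.
        windows.append((n_sites - window, n_sites))
    return windows


if __name__ == "__main__":
    # L = 795 and l = 50 come from the text; step = 10 is an assumption.
    sections = overlapping_windows(795, 50, 10)
    print(len(sections), sections[0], sections[-1])
```

Each window would then be passed to a tree-inference program, and the most probable trees from all windows pooled into the list of P unique trees.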
Using the 10 representative sequences as possible parentals, the BMCP model was then used to test whether the individual sequences found in the new branch of the ompL1 tree were products of recombination. In all cases, the posterior probability of recombination was >0.99, revealing evidence of multiple recombination events producing the challenge sequences. Three basic mosaic patterns were identified. The most common pattern showed an ompL1 gene with four breakpoints (most probable nucleotide locations, 48, 226, 508, and 707) (Fig. 4), dividing the sequence into five segments (Fig. 5A). The first and third segments were most closely related to ompL1 sequences from L. interrogans strains. The fifth segment aligned with the ompL1 sequences from L. kirschneri, and the second and fourth segments were derived from a new lineage. The gene representing this new lineage will be referred to as a peregrine allele, referring to the ability of its DNA to migrate across species boundaries. The five-segment mosaic pattern was virtually identical in all five mosaic ompL1 sequences found in L. interrogans strains and in the mosaic ompL1 sequence of L. borgpetersenii strain HB10 (Fig. 4).
FIG. 4. Most probable breakpoint locations along the ompL1 gene. The posterior distributions of breakpoint locations are plotted for eight mosaic ompL1 genes. Peaks on the distribution graphs indicate the most probable positions of recombination. Three mosaic patterns are shown. The most common mosaic pattern has four breakpoints (shown by shaded bands covering their 95% Bayesian credible intervals). A second pattern, shown for nAS10, has two breakpoints (shown by dark shaded bands covering their 95% Bayesian credible intervals). The third pattern, occurring in kCA02, has three breakpoints and is a hybrid of the first two patterns. Each strain is represented by a lowercase letter (i, L. interrogans; k, L. kirschneri; n, L. noguchii; b, L. borgpetersenii) followed by the 4-digit strain code (detailed descriptions of strains used in this study are given in Table 1).
FIG. 5. Mosaic ompL1 gene patterns. Three distinct mosaic ompL1 gene patterns were revealed by the BMCP model. The posterior probabilities of various lineages (red, L. interrogans; purple, L. kirschneri; blue, L. noguchii; green, the peregrine allele) giving rise to a region are plotted for the ompL1 genes from L. interrogans serovar Lai (A), L. kirschneri strain CA02 (B), and L. noguchii strain AS10 (C). The peregrine allele occurs in all three mosaic patterns between nucleotides 48 and 226 and between nucleotides 508 and 595.
A third pattern was the ompL1 sequence found in L. noguchii strain AS10. This was the simplest pattern, containing only two breakpoints and producing three segments (Fig. 5C). Each of the three segments was derived from a different lineage. The first segment was derived from the peregrine allele, the second segment retained the L. noguchii lineage, and the third segment aligned with ompL1 sequences from L. kirschneri. The first breakpoint was the same as the third site in the four-segment ompL1 mosaic of L. kirschneri strain CA02 (Fig. 4 and 5). However, the second breakpoint was unique to this mosaic pattern (most probable nucleotide location, 771).
Figure 5 shows that two ompL1 regions contain DNA derived from the peregrine allele in each of the three mosaic patterns. The first segment derived from the peregrine allele is located between nucleotides 48 and 226 and corresponds to OmpL1 variable region 1 (Fig. 2A). The second segment derived from the peregrine allele is located between nucleotides 508 and 595 and corresponds to variable region 3 (Fig. 2A).
Not controlling for the effects of recombination can lead to a false-positive inference of positive selection acting on the codons within an alignment (1). Recombination was controlled for when testing for positive selection by first partitioning the ompL1 gene sequences into 5 separate segments. The most probable breakpoint locations for the first mosaic pattern were used to specify the borders between segments. For each segment, possible sites under positive selection were identified and the empirical Bayes probabilities were used to assess significance. Using this process, amino acid positions 260 and 262 (numbered from the start of the mature protein), both in the last segment, were identified as experiencing positive selection (posterior probabilities, >0.99 and 0.86, respectively). For comparison purposes, not controlling for recombination by assuming a single tree across all regions flagged one additional codon at amino acid position 81 (posterior probability, 0.84).
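The partitioning step can be sketched as follows, using the most probable breakpoints of the five-segment mosaic (48, 226, 508, and 707) and the alignment length of 795 sites reported above. This is a hedged illustration: treating each breakpoint as the last site of the segment to its left is an assumption about the coordinate convention, and the function names are hypothetical.

```python
from bisect import bisect_left


def segments_from_breakpoints(n_sites: int, breakpoints: list[int]) -> list[tuple[int, int]]:
    """Split sites 1..n_sites into 1-based inclusive segments.

    Assumption: each breakpoint is the last site of the segment on its left.
    """
    bps = sorted(breakpoints)
    if any(b < 1 or b >= n_sites for b in bps):
        raise ValueError("breakpoints must lie strictly inside the alignment")
    bounds = [0] + bps + [n_sites]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(len(bounds) - 1)]


def segment_index(site: int, breakpoints: list[int]) -> int:
    """1-based index of the segment that contains a 1-based nucleotide site."""
    return bisect_left(sorted(breakpoints), site) + 1


if __name__ == "__main__":
    breakpoints = [48, 226, 508, 707]
    for number, (start, end) in enumerate(segments_from_breakpoints(795, breakpoints), start=1):
        print(f"segment {number}: sites {start}-{end}")
```

Each segment would then be analyzed with its own most probable tree, and the per-segment results combined under the stated assumption of independence between segments.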
The eight strains containing mosaic ompL1 genes belong to four different Leptospira species: L. interrogans, L. kirschneri, L. noguchii, and L. borgpetersenii. Similar mosaic ompL1 genes have been found in an L. weilii strain (Fig. 5). The finding that all five of the mosaic ompL1 sequences from L. interrogans strains were closely related, if not clonal, raises the possibility that the recombination events leading to the five-segment ompL1 mosaic occurred (Fig. 4A) in a single ancestral L. interrogans strain. In addition, the same five-segment ompL1 mosaic gene appears to have been horizontally transferred to L. borgpetersenii serovar Mini strain Sari (HB10), which is also the recipient of an intact (nonmosaicized) L. interrogans lipL41 gene. The variation in breakpoint locations, segment sizes, and lineages in the three ompL1 mosaic patterns indicates that recombination occurred independently in multiple ancestral strains. The L. kirschneri lineage of two of the four segments of the mosaic ompL1 gene of L. kirschneri strain CA02 suggests that the recombination events leading to this four-segment mosaic occurred in this or an ancestral L. kirschneri strain. Likewise, the L. noguchii lineage of the second segment of the three-segment ompL1 mosaic of L. noguchii strain AS10 suggests that the recombination events leading to this mosaic pattern occurred in an L. noguchii strain.
All eight ompL1 mosaic genes and all three of the mosaic patterns observed contain DNA fragments that appear to have been derived from a novel ompL1 allele, acquired by horizontal transfer of DNA. The source of this peregrine allele is uncertain. The mosaic sequences form an in-group in the ompL1 phylogenetic tree (Fig. 3), suggesting a leptospiral pathogen as the source of the peregrine allele. One possibility is that the donor species is a previously undescribed member of the core group of pathogenic Leptospira species. An alternative explanation is that the source is a paralogous ompL1 gene acquired from one of the known Leptospira species. The alternative explanation is less likely because Southern blot studies (13) and two leptospiral genome sequences reveal only a single ompL1 locus per genome (32, 38).
In all three ompL1 mosaic genes, the segments encoding the first and fourth surface-exposed loops are derived from the peregrine allele. These portions of the ompL1 gene appear to confer a selective advantage on leptospiral strains bearing mosaic ompL1 genes. This is consistent with the finding of increased amino acid sequence variability in the ompL1 gene regions corresponding to surface-exposed loops. The ompL1 gene was examined for evidence of positive selection, and two amino acid sites in the fifth surface-exposed loop were identified (Fig. 2). The lack of more prevalent positive selection, particularly across additional variable loops, suggests that increased sequence variability in certain OmpL1 regions is not a result of immunological pressure. This is consistent with the idea that the renal tubules, the site of leptospiral colonization in the reservoir host, are an immunologically privileged location. The increased amino acid sequence variability may indicate that certain surface-exposed loop variants represent adaptations to specific host environmental constraints or that amino acid changes are simply better tolerated in the surface-exposed loops.
It is unlikely that the intragenic recombination observed in some ompL1 genes is artifactual. PCR products were directly sequenced without cloning, eliminating the possibility of point mutations introduced through amplification. Both strands of the PCR products were sequenced, reducing the possibility of sequencing errors. Although four different genes were sequenced in each of 38 strains, evidence for intragenic recombination was found only in ompL1 genes. DNA recombination at the time of PCR amplification would be expected to result in a random set of recombination patterns. In contrast, the same pattern of intragenic recombination was found in six of the eight strains. One of the eight strains with evidence of ompL1 intragenic recombination is the L. interrogans serovar Lai genome sequence strain, in which the ompL1 sequence was derived from shotgun cloning of a clonal isolate. Finally, contamination of template DNA is also unlikely because the Leptospira donor species for some of the sequenced DNA has apparently never been isolated.
The studies presented here shed light on the phylogenetic organization of the genus Leptospira. The deeper branches of the L. santarosai and L. weilii monophyletic groups in most of the trees are consistent with the geographic isolation of these species to South America and Southeast Asia, respectively (5). In both the 16S and lipL32 trees, the L. interrogans branch is a clonal branch emerging from within the L. kirschneri tree, suggesting that the L. interrogans species may have evolved from an ancestral L. kirschneri strain. The emergence of the L. interrogans species relatively recently in leptospiral history would likely have been associated with changes in the ecology of its mammalian host. Since many L. interrogans strains are adapted to urban rodents, it is tempting to speculate that the expansion of the L. interrogans branch occurred with the development of agriculture.
Sequencing of the 16S gene is a standard approach for differentiating species in all branches of the phylogenetic tree of life (47). 16S sequences have been examined for the purpose of species differentiation within the genus Leptospira (19, 36) and have also been used as a method of determining the species identity of novel leptospiral strains (15). However, none of the genes examined here are ideal for this purpose when used individually. In the 16S tree, only L. santarosai, L. borgpetersenii, and L. weilii form monophyletic sister groups. In contrast, L. interrogans sequences form a monophyletic in-group within the L. kirschneri branch, and the L. kirschneri and L. interrogans sequences form a monophyletic group within the L. noguchii branch. Although the species of origin of most leptospiral 16S sequences could be readily identified by comparison with a large set of known sequences, certain sequences could be ambiguous. For example, the 16S sequence of L. interrogans strain PO01 would be difficult to distinguish from many L. kirschneri sequences. Due to greater sequence diversity, the lipL32 phylogenetic tree provides greater differentiation of L. noguchii, L. kirschneri, L. interrogans, and L. santarosai sequences, but the L. borgpetersenii and L. weilii sequences are indistinguishable. All species form monophyletic sister groups in the lipL41 and ompL1 phylogenetic trees, but the potential for horizontal transfer of genes invalidates the use of these sequences as an independent means of species identification.
The DNA and amino acid sequences of the four genes examined in this study appear to be evolving at different rates. The 16S genes were the most conserved, followed by the lipL32, lipL41, and ompL1 genes, listed in order of increasing sequence variability. The differences in sequence variability persisted even when only synonymous mutations were considered. Since LipL32 is the most highly expressed protein in pathogenic Leptospira species (14), codon bias is a possible explanation for the higher level of DNA sequence conservation. However, codon adaptation index measurements for ompL1 and lipL41 were higher than for lipL32, indicating that codon bias was not the explanation for the greater degree of sequence conservation among lipL32 genes.
Shuttle vector, gene knockout, and complementation techniques have been developed for the saprophytic organism Leptospira biflexa (34). In contrast, strategies are not yet available for genetic manipulation of pathogenic Leptospira species. The studies presented here indicate that certain leptospiral strains are able to be transformed with foreign DNA and to undergo intragenic recombination. These studies also provide guidance for the development of OMPs as vaccines for leptospirosis. The immunological relevance of the increased amino acid sequence variation of OmpL1 can now be assessed. An OmpL1-based leptospiral vaccine might need to be polyvalent to account for the major OmpL1 variants. On the other hand, the high levels of sequence conservation indicate that monovalent LipL41- and/or LipL32-based vaccines have the potential for being broadly protective.
This work was supported by VA Medical Research funds (to D.A.H.), Public Health Service grant CA-16042 (to M.A.S.), and NIH grants AI-34431 (to D.A.H.) and GM068955 (to M.A.S.).