Journal of Bacteriology, May 2004, p. 2646-2654, Vol. 186, No. 9
0021-9193/04/$08.00+0 DOI: 10.1128/JB.186.9.2646-2654.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, 46071 Valencia, Spain,1 Plant Research International, 6700AA Wageningen, The Netherlands2
Received 26 November 2003/ Accepted 24 January 2004
γ3 group of the Proteobacteria (2, 28). After their association, which started at least 150 million years ago, host and symbiont lineages have subsequently diverged strictly in parallel, by maternal transmission of the symbiont to eggs or embryos at blastoderm stage (24). The major role of B. aphidicola in the symbiosis is the provision of amino acids, which are lacking from the phloem sap diet (8). In past years, the discovery of plasmids in B. aphidicola that carry both the rate-limiting genes for biosynthesis of tryptophan (trpEG) and the genes for biosynthesis of leucine (leuABCD) was considered evidence of the overproduction of these essential amino acids, thus supporting the nutrient-provisioning role of B. aphidicola in aphid symbiosis (3, 4, 6, 17, 33, 40, 45, 47, 48). The main B. aphidicola chromosome is also present in multicopy in each cell (15), and in some cases B. aphidicola has fewer leucine and tryptophan plasmid copies than chromosome copies in each cell (30). The discovery that ratios of plasmid-borne trpEG and leuABCD copies to chromosomal gene copies vary, both within and between species (30, 44), casts doubt on the idea that plasmid location is a means of leucine and tryptophan overproduction that leads to a quick response to changes in demand for these amino acids (25). The evolutionary history of the plasmids is puzzling, due to the fact that not all of the lineages of aphids carry plasmids and not all plasmids have the same gene content and/or gene order. B. aphidicola strains associated with aphids of the subfamily Aphidinae and some tribes of the subfamily Pemphiginae contain tryptophan plasmids (17, 33, 47), ranging in size from 3.0 to 12.8 kb, which contain the two first genes of the tryptophan pathway (trpEG). The variability in size is due mainly to variability in the number of tandem repeats of these genes or pseudogenes.
In the case of leucine plasmids, only a single replicon, named repA1, has been found, but the gene content and/or gene order is different in different lineages, indicating a great plasticity of the leucine plasmids throughout B. aphidicola evolution. In fact, up to seven plasmids that are different in both gene order and gene content have been found (45). The first leucine plasmid, pRPE (renamed pBRp), was described for B. aphidicola strains associated with Rhopalosiphum padi (6), a member of the subfamily Aphidinae. It contains the genes encoding key enzymes in the pathway leading to leucine, in the same order as in Escherichia coli (leuABCD). The other genes of the leucine plasmid are two copies of repA, which code for plasmid replicases, and open reading frame 1 (ORF1) (renamed yqhA), encoding a putative integral membrane protein. The same gene content in the same order was found in strains associated with other species of the Aphidinae subfamily (3, 41). The leucine genes have also been located in plasmids in B. aphidicola strains associated with members of the subfamilies Pterocommatinae, Thelaxinae, and Lachninae, with each lineage showing special features (40, 45, 48). Finally, in strains associated with the subfamily Pemphiginae, cryptic plasmids have been found. They are phylogenetically related to the leucine plasmids but do not have the structural leucine genes. These plasmids contain only the origin of replication and one or two copies of the repA gene, plus one or two more genes (ibp or yqhA), depending on the different tribe within the subfamily (Pemphigini, Eriosomatini, or Fordini). It was suggested that they probably represent the ancestral replicon, related to the IncFII plasmids in which the other genes were relocated (48).
During the past 3 years, the whole genomes of three B. aphidicola strains have been completely sequenced (37, 42, 46): B. aphidicola BAp and BSg, associated with the aphids Acyrthosiphon pisum and Schizaphis graminum, respectively, which belong to the same aphid subfamily (Aphidinae) but to different tribes (Macrosiphini and Aphidini, respectively), and B. aphidicola BBp, associated with the aphid Baizongia pistaciae, a member of the subfamily Pemphiginae (tribe Fordini). A comparison of BAp and BSg, with an estimated divergence time of 50 to 70 million years, revealed an extreme conservation of the genome order, with neither chromosomal rearrangements (translocations, inversions, or duplications) nor gene acquisition by horizontal gene transfer, thus being the most extreme case of genome stability to date (42). The comparison with BBp revealed nearly perfect gene order conservation, with only four minor rearrangements (two inversions and two translocations involving the leucine and tryptophan plasmid-carried genes) in the BBp strain. Since the Aphidinae and Pemphiginae lineages diverged about 80 to 150 million years ago, van Ham et al. (46) suggested that B. aphidicola can be considered a "gene order fossil" and that the onset of genomic stasis coincided with the establishment of the symbiosis. However, the gene contents are different in the three lineages, indicating that independent gene losses have occurred from the last common symbiotic ancestor (LCSA) of B. aphidicola (38).
In the case of BBp, the leucine cluster is located in the chromosome, flanked by the genes yqgF and yggS. However, in a B. aphidicola strain (BPs) associated with the aphid Pemphigus spyrothecae, which is also a member of the Pemphiginae but belongs to a different tribe (Pemphigini), the cluster is also located in the chromosome but is flanked by the genes trxA and rep (34).
In the present work we have characterized the four leucine genes, as well as the flanking regions, that are located in the chromosome in B. aphidicola strains associated with two new species: Tetraneura caerulescens, a member of the tribe Eriosomatini (subfamily Pemphiginae), and Chaitophorus populeti, a member of the tribe Chaitophorini (subfamily Chaitophorinae). These data, together with the two previous leucine cluster chromosomal locations in BBp and BPs, are consistent with four independent insertions of the leucine plasmid throughout B. aphidicola evolution from an ancestral plasmid present in the LCSA.
Location of leucine cluster. To determine the location of the leucine genes, either amplified in a plasmid or in the bacterial chromosome, as well as the gene order of the four leucine genes, we followed the procedure outlined by van Ham et al. (48), based on structural PCRs, restriction maps, and hybridization with probes from the pBRp plasmid as described previously (34).
Amplification, cloning, and sequencing of the leucine cluster and flanking regions. A strategy based on overlapping PCR fragments was used to obtain the sequences of the chromosomal regions containing the leucine genes in the two species. All PCR products were purified and cloned into T-pBluescript (19) or into pGEM-TEasy (Promega). Table 1 lists the specific primers used in the amplifications.
TABLE 1. Primers used to amplify the leucine cluster and flanking regions in this study
FIG. 1. Gene order and flanking regions of the leucine cluster in B. aphidicola strains. (A) Members of the subfamily Pemphiginae, showing the chromosomal regions with their corresponding cryptic leucine-related repA plasmids. (B) Subfamily Chaitophorinae. (C) Leucine plasmids in pBTs (subfamily Thelaxinae) and BAp (subfamily Aphidinae; plasmids from strains BSg, BDn, and BRp present the same gene content and order) (see Table 2 for strain designations). Black bars indicate the sequenced regions (reference 34 and this study). Arrows are oriented according to the coding strand. Numbers in kilobases indicate the position of the gene in the BAp chromosome (37).
λZAP II-EcoRI (Stratagene) according to the manufacturer's instructions. Four recombinants of the 1.6- to 1.2-kb partial EcoRI library were excised in vivo to plasmids with ExAssist helper phage (Stratagene) according to the manufacturer's instructions. Specific outward-facing primers (TcleuC-Fi, TcleuB-Ri, TcleuD-Ri, and TcleuA-Fi) were designed to amplify the upstream region of the cluster by ilPCR. The other BTc primers (Table 1) were used to finish sequencing of the region. PCR mixtures contained 40 or 15 pmol of each degenerate or specific primer, respectively; 500 nM deoxynucleoside triphosphates; 1x buffer system 3; and 0.75 µl of Taq polymerase mix in a 50-µl final reaction volume. The amplification profile was 92°C for 2 min; 10 cycles of 92°C for 10 s, 52°C (62°C for ilPCR) for 30 s, and 68°C for 1 min (10 min for ilPCR); 20 more cycles with an autoextension of 20 s/cycle at 68°C; and a final extension at 68°C for 7 min. The annealing temperature varied from 52 to 62°C, depending on the primer pair.
The ilPCR product obtained with outwardly oriented primers within leuA lacks 389 nucleotides of the original leuA (48). To complete the cluster fragment, one degenerate leuA primer (leuA-R2 [Table 1]) was designed and used in PCR in combination with leuD.du2 to obtain a 500-nucleotide fragment containing the missing leuA fragment in both species.
The sequencing of all of the clones (in both directions) was carried out in a PE/ABI 377, 310, or 3100 instrument with a dRhodamine or BigDye version 1.0 dye terminator cycle sequencing kit (Perkin-Elmer). Universal primers T3, T7, UNI17-mer, and UNIrev as well as specific primers were also used.
Computer and phylogenetic analysis. DNA sequence data were assembled with the program Sequencher version 4.0 (Genecodes Co.). Blastx version 2.2.1 (http://www.ncbi.nlm.nih.gov/BLAST) was used to identify the ORFs and for gene assignment.
For comparative analysis we chose representative B. aphidicola strains that had the leucine cluster, either in a plasmid or in the main chromosome, completely sequenced. Table 2 summarizes the main features of the clusters (gene order and location) as well as the GenBank/EMBL nucleotide sequence accession numbers. We classified the aphids as described previously (32).
TABLE 2. Taxonomic status, location, and gene order of the leucine cluster in the aphid species (family Aphididae) analyzed in this study.
FIG. 2. Phylogenetic trees obtained with the neighbor-joining algorithm for the leucine cluster (A) and the ibp genes (B). In panel A, representative events according to the proposed scenarios for leucine cluster evolution are shown: (i) (back transfer) LCSA (P), cluster present in an ancestral plasmid, and then four insertions in the chromosome (black arrows); (ii) LCSA (C), cluster present in the ancestral chromosome, and then two transfers into plasmids (white arrows). Abbreviations for B. aphidicola strains and accession numbers are in Table 2. Other γ-proteobacteria are Escherichia coli K-12 MG1655 (Eco) (accession number U00096), Salmonella enterica subsp. enterica serovar Typhi CT18 (STY) (AL513382), S. enterica subsp. enterica serovar Typhimurium LT2 (STM) (AE006468), Yersinia pestis strain CO92 (Ype) (AL590842), and Vibrio cholerae strain N16961 (Vch) (AE003852). The same species were used in the ibp gene phylogeny, except for V. cholerae and the gene of W. glossinidia that was present in the plasmid pWb1 (NC_003425). Bootstrap values below 50% are not reported.
TABLE 3. Sizes of the intergenic regions between the four leucine genes and the corresponding flanking genes of the four B. aphidicola strains with the genes in a chromosomal location and two leucine plasmids
TABLE 4. Sizes of intergenic regions between contiguous genes of the three sequenced B. aphidicola genomes where the leucine cluster has been inserted in the four chromosomal versions
Potential ribosome-binding sites. The search for putative ribosome-binding sites upstream of the four leucine genes in the four B. aphidicola strains with the chromosomal location (Table 5) revealed that in four cases this regulatory sequence seems to be absent. The apparent absence of regulatory sequences similar to the eubacterial consensus sequence is a recurrent observation in studies of B. aphidicola DNA (27, 48). This is illustrated by the difficulty of finding a -35 sequence in the promoter of the genes, and it is mainly due to the high A+T content of the B. aphidicola genome (around 75%).
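A search of this kind can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the Shine-Dalgarno core `AGGAGG`, the minimum match length of 4 nucleotides, the 4- to 15-nucleotide spacing window, and the function names `find_rbs` and `at_content` are all assumptions made for the example, and the test sequences are invented.

```python
SD_CONSENSUS = "AGGAGG"  # assumed Shine-Dalgarno core (E. coli-like consensus)


def sd_candidates(min_len=4):
    """Substrings of the consensus, longest first, in a fixed order."""
    out = []
    for n in range(len(SD_CONSENSUS), min_len - 1, -1):
        for i in range(len(SD_CONSENSUS) - n + 1):
            out.append(SD_CONSENSUS[i:i + n])
    return out


def find_rbs(upstream, min_spacing=4, max_spacing=15, min_len=4):
    """Look for a Shine-Dalgarno-like motif in the sequence lying
    immediately 5' of a start codon (last base = position -1).

    Returns (motif, spacing), where spacing is the number of bases
    between the motif and the start codon, or None if nothing matches.
    """
    upstream = upstream.upper()
    for motif in sd_candidates(min_len):
        for spacing in range(min_spacing, max_spacing + 1):
            end = len(upstream) - spacing
            start = end - len(motif)
            if start >= 0 and upstream[start:end] == motif:
                return motif, spacing
    return None


def at_content(seq):
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


# Invented upstream regions, for illustration only.
with_site = find_rbs("TTTAAAGGAGTTTTATAA")
without_site = find_rbs("ATTTTAAATATTTAAAAT")
```

In a genome with roughly 75% A+T, purine-rich motifs such as the one assumed here are expected to be matched only partially or not at all, which is the pattern the paragraph above describes.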
TABLE 5. Potential ribosome-binding sites of the four leucine genes in the species that contain the leucine cluster in the chromosome
Phylogenetic analysis of leucine and ibp genes. Two different phylogenetic analyses were carried out to learn more about the origin of the different chromosomal locations of the leucine genes and also to assess whether the plasmid versus chromosomal location had any influence on the phylogenetic relationship of the B. aphidicola strains.
Figure 2A shows the phylogenetic reconstruction obtained with the four concatenated leucine genes in the nine B. aphidicola strains (see Table 2 and Materials and Methods), four closely related free-living bacteria, and Vibrio cholerae, a distantly related species that was used as an outgroup. As can be seen, all of the B. aphidicola strains cluster together, thus corroborating their monophyletic origin. The branch lengths show the evolutionary acceleration that B. aphidicola has undergone compared to its free-living relatives, as has already been stated in several previous works (see, e.g., reference 22). Regarding the relationship of the B. aphidicola strains, there is a clear monophyletic group formed by those associated with the Aphidinae subfamily (bootstrap value, 100). The remaining strains give a group formed by strains associated with the Thelaxinae and Chaitophorinae and, finally, the three strains of the Pemphiginae. This topology agrees with the one proposed by Heie (14), although the low support for some of the branches confirms the difficulty in obtaining a consistent phylogeny of the main B. aphidicola aphid lineages, as already pointed out (20, 29). Regarding the present work, the most relevant aspect of this phylogeny is that the four leucine genes seem to have evolved independently of their position, in either a plasmid or a chromosome, as shown by the cluster formed by BCp and pBTs. The group formed by strains associated with the Chaitophorinae and Thelaxinae has previously been obtained with other genes, such as those encoding GroEL (9).
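The trees in Fig. 2 were obtained with the neighbor-joining algorithm. As a rough illustration of how that clustering step proceeds (not the software or settings used for the figure, which are not stated here), the sketch below repeatedly joins the pair of taxa that minimizes the standard Q criterion. The taxon labels and distances are a made-up toy matrix, the function name `neighbor_joining` is hypothetical, and branch lengths and bootstrapping are omitted.

```python
def neighbor_joining(names, dist):
    """Return an unrooted topology as a nested tuple.

    names: list of taxon labels.
    dist:  dict of dicts with symmetric pairwise distances.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair with the smallest Q value (ties broken by index).
        _, i, j = min(
            ((n - 2) * d[a][b] - r[a] - r[b], i, j)
            for i, a in enumerate(nodes)
            for j, b in enumerate(nodes)
            if i < j
        )
        a, b = nodes[i], nodes[j]
        u = (a, b)  # new internal node
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                # Distance from the new node to every remaining node.
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    return tuple(nodes)


# Toy distance matrix (invented values, for illustration only).
labels = ["a", "b", "c", "d", "e"]
rows = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy = {x: {y: rows[i][j] for j, y in enumerate(labels)}
       for i, x in enumerate(labels)}
tree = neighbor_joining(labels, toy)
```

With real data, the distance matrix would be computed from the aligned, concatenated leucine genes, and support for each grouping would be assessed by bootstrap resampling as reported in Fig. 2.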
The genome of Wigglesworthia glossinidia, the primary endosymbiont of the tsetse fly (1), contains a small plasmid (pWig1) carrying eight genes. The W. glossinidia plasmid does not contain the leucine genes but does contain the gene ibp, the same as has been found in the plasmid pBTs and in the cryptic plasmid pBPs1 (Fig. 1). The similarity between the ibp gene from B. aphidicola and that from W. glossinidia was high (52 to 55%).
The ibp gene encodes a small heat shock protein belonging to the HSP20 gene family. This gene is present in several copies in some gamma-proteobacterial species. For example, Salmonella species contain three different copies, while E. coli contains only two (ibpA and ibpB). An analysis of orthology revealed that neither of the two E. coli genes was an ortholog of the Buchnera ibp genes. In Fig. 2B, a phylogenetic tree with two (ibp and ibpB) of the three paralogous genes is shown. The orthologous ibp group includes the four B. aphidicola ibp genes, located in either the plasmid or the chromosome; the plasmid-located Wigglesworthia gene; and the chromosomally located Salmonella genes STM1251 and STY1871. The close phylogenetic relationship between W. glossinidia and B. aphidicola was expected, according to a recent phylogenetic reconstruction (12). This topology suggests that an ibp gene carried by a plasmid was present in the ancestor of the endosymbiont species and that this gene was transferred to the main chromosome in an ancestor of B. aphidicola associated with the Aphidinae.
Finally, taking into account that horizontal gene transfer is a very rare phenomenon in B. aphidicola (but see reference 45), together with the observed synteny, it was possible to infer that gene loss in B. aphidicola is an ongoing process in all of the B. aphidicola lineages. This fact is corroborated by the finding of some B. aphidicola strains associated with the Lachninae subfamily with chromosomal genome sizes of 450 to 470 kb (11). In fact, we can consider the genome of a B. aphidicola strain associated with the aphid Cinara cedri to be the smallest known bacterial genome reported so far (450 kb). The evolution of B. aphidicola would be a case of degenerative, rather than adaptive, genome evolution. Genetic isolation and small effective population size may be the main determinants of this degenerative process (22). According to van Ham et al. (46), prolonged genomic stasis could be unsustainable in the long term and could be a symptom of genome degeneracy, despite the strength of compensatory processes such as the stabilizing effect of chaperones on cellular proteins (10). It has also been stated that B. aphidicola was essential to the success of aphids in the initial radiation but is no longer a source of ecological innovation for its host, because the ecological diversification of aphids cannot be attributed to the current genetic diversity of B. aphidicola (26, 42).
However, the results obtained in the present work cannot be explained under the genomic stasis hypothesis. Regarding the leucine genes, up to seven different repA plasmids (6, 40, 45, 48) and four different leucine gene chromosomal positions have been found (references 34 and 45 and the present work).
Since van Ham et al. (48) discovered different locations of the leucine gene cluster, either on the chromosome or on a plasmid, two possible scenarios have been proposed, as follows.
(i) The leucine cluster, probably an operon as in E. coli (50), was located in the chromosome of the B. aphidicola LCSA that predated the symbiosis about 200 million years ago. This bacterium would have carried a cryptic plasmid with at least a repA gene. After establishing symbiosis, the leucine genes were transferred to plasmids independently in several B. aphidicola lineages, resulting in leucine plasmids, with a different gene order and gene content. The minimum number of transfer events would have depended on the phylogenetic relationship used (two in the relationship shown in Fig. 2A). This was the scenario first proposed by our group, for the evolution of both the leucine cluster (48) and the trpEG genes (47). Accordingly, the four different locations of the leucine cluster in the B. aphidicola chromosome would be due to intrachromosomal rearrangements.
(ii) Alternatively, the transfer of the leucine cluster to a repA plasmid took place only once in the common ancestor of B. aphidicola. The different locations of the leucine genes in the B. aphidicola chromosome were due to independent back transfers to the main chromosome throughout B. aphidicola evolution.
The first leucine chromosomal cluster and its flanking genes, from a B. aphidicola strain (BPs) associated with a member of the Pemphiginae, were sequenced by Sabater-Muñoz et al. (34) (Table 2 and Fig. 1). In that work it was postulated that a leucine plasmid was present in the B. aphidicola LCSA that preceded the diversification of all the endosymbionts and that the chromosomal location of the leucine genes observed in some B. aphidicola strains arose by a transfer of such genes from a plasmid to the main chromosome. A three-step back transfer scenario was then postulated, supported by the large sizes of the intergenic regions between leuB and leuA (Table 3) and between the genes ibp and repA1 in the cryptic plasmid (858 bp) of BPs (Fig. 1).
The sequencing of the genome of BBp, another strain associated with the Pemphiginae, showed that the leucine cluster was flanked by different genes (yqgF and yggS) and had the gene order leuBCDA (Fig. 1). A striking fact was that yqgF and yggS were adjacent in BAp and BSg, while the genes trxA and rep, which flank the leucine cluster in BPs, were contiguous in the BBp chromosome (Table 4). These results suggest that the chromosomal positions of the leucine cluster in BBp and BPs were due to two independent insertion events in the ancestral LCSA chromosome, even though the two aphid species belong to the same subfamily.
The sequencing of two new chromosomal leucine clusters carried out in the present work (those of BTc, another strain associated with Pemphiginae from a third tribe, and of BCp, associated with the subfamily Chaitophorinae) also supports the second scenario. Moreover, given the overall data, the four chromosomal locations found in three strains associated with the subfamily Pemphiginae and one associated with the Chaitophorinae can be explained only by four independent insertions (Fig. 2A). The cryptic repA plasmids found in the subfamily Pemphiginae would be the remnant of the ancestral leucine plasmid, as previously postulated (34).
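The event counts quoted for the two scenarios can be checked with a small-parsimony calculation. The sketch below is illustrative only: the tree is a hypothetical arrangement consistent with the groups described for Fig. 2A (the exact branching order is an assumption), the strain labels are taken from Fig. 1 and the text, every change of location is given a unit cost, and the function name `min_events` is invented for the example.

```python
INF = float("inf")

# Hypothetical topology: Aphidinae, Thelaxinae + Chaitophorinae, Pemphiginae.
TREE = (
    (("BAp", "BSg"), ("BDn", "BRp")),
    ("BTs", "BCp"),
    (("BBp", "BPs"), "BTc"),
)

# Location of the leucine cluster: "P" = plasmid, otherwise the
# chromosomal site named by its flanking genes.
SITE = {
    "BAp": "P", "BSg": "P", "BDn": "P", "BRp": "P", "BTs": "P",
    "BCp": "gnd-dcd",
    "BBp": "yqgF-yggS",
    "BPs": "trxA-rep",
    "BTc": "truA-yadF",
}

# Coarser coding that ignores where in the chromosome the cluster sits.
BINARY = {k: ("P" if v == "P" else "C") for k, v in SITE.items()}


def min_events(tree, leaf_state, root_state):
    """Minimum number of state changes on the tree, given the root state
    (Sankoff-style dynamic programming with unit costs)."""
    states = sorted(set(leaf_state.values()) | {root_state})

    def cost(node):
        if isinstance(node, str):  # leaf: only its observed state is allowed
            return {s: 0 if s == leaf_state[node] else INF for s in states}
        children = [cost(child) for child in node]
        return {
            s: sum(min(c[t] + (t != s) for t in states) for c in children)
            for s in states
        }

    return cost(tree)[root_state]


# Scenario with a chromosomal ancestor: transfers into plasmids.
transfers_to_plasmid = min_events(TREE, BINARY, "C")
# Scenario with a plasmid ancestor: insertions at distinct chromosomal sites.
insertions = min_events(TREE, SITE, "P")
```

Under these assumptions the calculation gives two transfers for a chromosomal ancestor and four insertions for a plasmid ancestor. The count of four depends on treating each pair of flanking genes as a distinct site, which is the observation that distinguishes the back-transfer scenario from a single ancestral insertion.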
In BTc, the cluster is inserted between the genes truA and yadF (followed by mrcB). In the three sequenced genomes, the gene yadF is absent (Table 4). As mentioned above, a possible explanation is that yadF was present in the LCSA but was convergently lost and disintegrated both in the lineage of B. pistaciae and in the Aphidinae. In the case of the Pemphiginae subfamily, the loss would have taken place after the divergence of the Eriosomatini from the Fordini tribe. The large intergenic region in BBp (249 bp [Table 4]) would indicate that, in fact, this process of disintegration began recently in the Fordini tribe. In the case of BSg and BAp, the small intergenic spaces would indicate that the yadF gene started its disintegration in the ancestor of the Aphidinae subfamily.
In BCp the insertion occurred between the genes gnd and dcd. These two genes are also contiguous in the three sequenced genomes (Table 4), thus indicating that the insertion must have occurred in the lineage leading to the Chaitophorinae. We know that, at least in Chaitophorus leucomelas, another species of the same genus, the leucine cluster is inserted in the same position (data not shown). More data are needed to know whether the back transfer predates the divergence of the Chaitophorinae lineage from the rest of the Drepanosiphine group (31).
It has been postulated that the absence of essential genes involved in recombination and repair processes would explain, at least in part, the gene order conservation. We postulate that as recA is absent from the B. aphidicola genome, the possible insertions would have been mediated by the recBCD system that has been retained in the three sequenced genomes and in the B. aphidicola strain associated with C. cedri, which is currently being sequenced (data not shown). Thus, in the absence of recA, recBCD may serve as a general exonuclease repair enzyme functioning as a substitute for recombinational repair.
The ancestral LCSA plasmid may have contained the four leucine genes and at least one repA gene, plus two additional genes, ibp and yqhA. The ibp gene is present in two of the seven types of B. aphidicola repA plasmids, and an orthologous gene was detected in the small plasmid present in W. glossinidia (Fig. 2B). The presence of ibp in the chromosome in BAp and BSg probably indicates a back transfer event. A very intriguing fact is that the ibp gene is absent in the BBp genome, while the trpEG genes in the BBp genome are located in the place where the ibp gene is located in the BAp and BSg genomes. Whether the trpEG genes, which are also carried on plasmids in some B. aphidicola lineages, are in some way related to the repA plasmid is a question that deserves more intense study.
Regarding the yqhA gene, it is present in five of the seven types of B. aphidicola repA plasmids, in strains associated with the subfamilies Aphidinae, Pterocommatinae, Thelaxinae, and Pemphiginae. This wider distribution suggests that the gene may have been present in the ancestral plasmid, and it could have been transferred to the main chromosome in the other two types of repA plasmids.
Finally, the reason why some lineages transferred the plasmid back to the main chromosome is an open and unsolved question but is probably related to a nutritional basis (25) and/or to the chromosome and plasmid copy numbers. The presence of the leucine genes in a plasmid in the LCSA was probably advantageous, due to the large number of plasmid copies. However, the loss of genes involved in the control of chromosome replication and segregation led to polyploidy of the bacterial cell. Thus, in many strains, the number of chromosomal copies could be higher than the number of plasmid copies (44). In these circumstances, the presence of the leucine genes in the chromosome could have turned out to be more advantageous for the stability and expression of these genes.
We thank J. M. Michelena and P. González for aphid species identification and F. Barraclough for help with the English. We also acknowledge Servicio de Secuenciación de Ácidos Nucléicos y Proteínas and Servicio de Bioinformática at SCSIE (Universitat de València) for sequencing and bioinformatics support, respectively.