ABSTRACT
Pyrococcus furiosus and Pyrococcus woesei grow optimally at temperatures near 100°C and were isolated from the same shallow marine volcanic vent system. Hybridization of genomic DNA from P. woesei to a DNA microarray containing all 2,065 open reading frames (ORFs) annotated in the P. furiosus genome, in combination with PCR analysis, indicated that homologs of 105 ORFs present in P. furiosus are absent from the uncharacterized genome of P. woesei. Pulsed-field electrophoresis indicated that the sizes of the two genomes are comparable, and the results were consistent with the hypothesis that P. woesei lacks the 105 ORFs found in P. furiosus. The missing ORFs are present in P. furiosus mainly in clusters. These clusters include one cluster (Mal I, PF1737 to PF1751) involved in maltose metabolism and another cluster (PF0691 to PF0695) whose products are thought to remove toxic reactive nitrogen species. Accordingly, it was found that P. woesei, in contrast to P. furiosus, is unable to utilize maltose as a carbon source for growth, and the growth of P. woesei on starch was inhibited by addition of a nitric oxide generator. In P. furiosus the ORF clusters not present in P. woesei are bracketed by or are in the vicinity of insertion sequences or long clusters of tandem repeats (LCTRs). While the role of LCTRs in lateral gene transfer is not known, the Mal I cluster in P. furiosus is a composite transposon that undergoes replicative transposition. The same locus in P. woesei lacks any evidence of insertion activity, indicating that P. woesei is a sister or even the parent of P. furiosus. P. woesei may have acquired by lateral gene transfer more than 100 ORFs from other organisms living in the same thermophilic environment to produce the type strain of P. furiosus.
The shallow marine volcanic vents of Vulcano Island, Italy, have proven to be a rich source of thermophilic archaea. More than a dozen organisms have been isolated from this location (2, 22, 26, 27, 34, 50, 57, 58, 62, 63, 68), including two species of Pyrococcus, Pyrococcus furiosus (26) and Pyrococcus woesei (68). P. furiosus was the first of these organisms to be discovered in the Vulcano Island ecosystem and is now one of the best studied of the hyperthermophilic archaea. It grows optimally at temperatures near 100°C and utilizes peptides and carbohydrates as carbon and energy sources, generating organic acids and hydrogen or, if elemental sulfur is present, hydrogen sulfide as end products. The physiology of P. woesei appears to be very similar to that of P. furiosus. These organisms have the same growth temperature range and use the same carbon sources and terminal electron acceptors (26, 68). The genome of P. furiosus has been sequenced. It is 1.9 Mb long and contains more than 2,000 open reading frames (ORFs) (51). Although the genome of P. woesei has not been sequenced, 19 protein sequences and two RNA sequences are available in public databases, and all of these sequences exhibit at least 99% identity to their homologs in P. furiosus (5, 13-15, 16-19, 21, 35, 39, 41, 52, 64, 70); this includes the 16S rRNA sequences, which are 100% identical (39).
The question arises as to how these two Pyrococcus species originated and what the evolutionary relationship between them is. This is an intriguing issue given the fact that these organisms are found in the same geothermal environment. Indeed, because of the identity of their 16S rRNA sequences, the striking similarities in their physiological properties, and the disruption of a putative Na+/H+ antiporter gene (napA, PF0275) by an insertion sequence (IS), it was recently concluded that P. woesei should be classified as a subspecies of P. furiosus (39). To evaluate the overall genetic differences between the two organisms, we utilized DNA microarrays based on the complete genome of P. furiosus (55, 56). The fundamental question to be addressed is, does P. woesei contain homologs of all the genes found in P. furiosus? Moreover, if the answer is no, what are the consequences of any differences in terms of evolution, physiology, and metabolism? So far, genome-based DNA microarray comparisons have been restricted to mesophilic bacteria, where the goals were to determine the presence or absence of genes associated with pathogenic and nonpathogenic strains (10, 28, 33, 42, 49, 53, 67). We show here that the results of DNA microarray comparisons allow testable predictions to be made about physiology and metabolism. In addition, the whole-genome approach also provides an opportunity to gain insight into interactions between members of the Vulcano Island environment. This may provide a means to assess global genetic exchanges that potentially occurred at a time when primitive archaea lived on a hot earth and acquired or disseminated genetic innovation, such as stress resistance or utilization of a carbon source, for survival.
MATERIALS AND METHODS
Array design and DNA preparation. Microarray slides were designed and processed as previously described (55, 56). P. furiosus DSM 3638 and P. woesei DSM 3773 were grown in 1-liter culture bottles using maltose or peptides as the carbon source (65). Cells were harvested at the end of exponential growth, and genomic DNA (gDNA) was isolated by a phenol-chloroform protocol (54).
Preparation of labeled DNA and hybridization conditions. Labeled DNA was prepared using a Prime-It Fluor kit (Stratagene, La Jolla, CA) according to the manufacturer's instructions, except that a dUTP-aminoallyl tag (Sigma, St. Louis, MO) was used. Tagged DNA products and Alexa-labeled DNA were purified using a QIAquick PCR purification kit (QIAGEN, Valencia, CA). Aminoallyl-labeled DNA was coupled with Alexa dyes 488, 546, 594, and 647 (Molecular Probes, Eugene, OR) according to the manufacturer's instructions. Labeled genomic DNA samples were hybridized to a microarray slide containing PCR products for all 2,065 ORFs in the P. furiosus genome (55, 56) using a Genetac hybridization station (Genomic Solutions, Ann Arbor, MI) for 14 h at 65°C. The slides were then washed for 20 s each in 2× sodium chloride-sodium citrate buffer (SSC) containing 0.1% Tween 20, 0.2× SSC containing 0.1% Tween 20, and 0.2× SSC, rinsed in distilled water, and blown dry with compressed air. The fluorescence intensities of the four dyes, which represented one experiment in triplicate with one control, were measured using a Scan Array 5000 slide reader (Perkin-Elmer, Boston, MA) with the appropriate laser and filter settings.
Data analysis. Spots were identified and quantitated using the Gleams software package (Nutec, Houston, TX). The relative fluorescence intensities were averaged from three sets of microarray data generated from slides that contained the P. furiosus genome printed in triplicate (a total of nine arrays). The values for spots that gave negative fluorescence intensities with P. woesei DNA (78 of the 2,065 spots examined) were converted to an arbitrary value of 200 U, and the values for spots that gave negative fluorescence intensities with P. furiosus DNA (16 of 2,065 spots) were converted to the minimum detection limit (2,000 U). Rather than discarding spots with negative P. furiosus values, these conversions retained spots that gave a positive P. woesei hybridization signal even though the P. furiosus control did not. For each spot, the P. furiosus fluorescence intensity was divided by the P. woesei fluorescence intensity. The resulting values were then multiplied by −1 and added to 100. Values less than 98 were interpreted to indicate that a P. furiosus gene did not have a homolog in the P. woesei genome (corresponding to a P. woesei fluorescence intensity less than 50% of the P. furiosus fluorescence intensity). An ORF analysis was conducted using the InterProScan tool (version 3.3; http://www.ebi.ac.uk/interpro/ ). ISs were analyzed using ISfinder (http://www-is.biotoul.fr/ ). Homologs of P. furiosus ORFs in the genomes of Pyrococcus abyssi (9, 24), Pyrococcus horikoshii (31, 40), and Thermococcus kodakaraensis were defined as ORFs which encoded proteins that exhibited at least 75% sequence similarity over at least 75% of the protein length when the data were analyzed by BLASTP.
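The scoring rule described above can be sketched in a few lines. This is only an illustration of the arithmetic as stated in the text, not the software that was used; the function and constant names are hypothetical, and the sketch assumes that intensities are nonzero once negative values have been replaced.

```python
# Illustrative sketch of the spot-scoring rule (names are hypothetical).
WOESEI_FLOOR = 200.0     # arbitrary value assigned to negative P. woesei spots (U)
FURIOSUS_FLOOR = 2000.0  # minimum detection limit assigned to negative P. furiosus spots (U)
CUTOFF = 98.0            # scores below this value indicate a missing homolog


def spot_score(furiosus: float, woesei: float) -> float:
    """Return 100 minus the P. furiosus / P. woesei intensity ratio for one spot."""
    f = FURIOSUS_FLOOR if furiosus < 0 else furiosus
    w = WOESEI_FLOOR if woesei < 0 else woesei
    # Assumption: w is never exactly zero after the replacement above.
    return 100.0 - f / w


def homolog_absent(furiosus: float, woesei: float) -> bool:
    """True when the score falls below the cutoff (P. woesei signal < 50%)."""
    return spot_score(furiosus, woesei) < CUTOFF


if __name__ == "__main__":
    # Made-up intensities, chosen only to exercise each branch.
    for f, w in [(10000, 10000), (10000, 5000), (10000, 4000), (10000, -50), (-10, 3000)]:
        print(f, w, round(spot_score(f, w), 2), homolog_absent(f, w))
```

With this rule, a spot whose P. woesei intensity is exactly half of the P. furiosus intensity scores 98 and is still counted as present; only lower P. woesei signals fall below the cutoff.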
PCR and sequencing. Primers were synthesized and PCR products were sequenced by the University of Georgia Integrated Biotech Laboratories (http://www.ors.uga.edu/ibl/index.html/ ). PCR analyses were carried out in triplicate with different annealing temperatures and positive controls. The primers used to amplify the Mal I locus were forward primer 5′-AAT ACG CTC ATA GAA TCA AAG-3′ and reverse primer 5′-CCC TAT GAC TGC CTT TGG ATT-3′. All PCR reagents were obtained from Stratagene, and previously described standard molecular biology techniques were used (54).
Growth studies. The two types of maltose used in the growth studies were 95% grade (M2250; Sigma) and 99% grade (M9171; Sigma). Roussin's black salt (RBS) {Na+[Fe4S3(NO)7]−} (8) was provided by Martin Hughes (King's College, London, United Kingdom). The dried powder was suspended in degassed water to a final concentration of approximately 0.9 μM prior to use.
PFGE. The pulsed-field gel electrophoresis (PFGE) procedure was adapted from the procedure described by Borges et al. (6). Gel plugs were made by suspending cells of P. furiosus or P. woesei to a concentration of 5 × 10^9 cells/ml in 1% (wt/vol) agarose. The plugs were incubated with 0.1 M EDTA, 1% (wt/vol) cetyltrimethylammonium bromide, 1% (wt/vol) sodium dodecyl sulfate, 1% (vol/vol) Triton X-200, and proteinase K (2.0 mg/ml) for 24 h at 42°C. The plugs were washed twice with 10 mM Tris buffer (pH 8.0) containing 1 mM EDTA (TE buffer) for 15 min at 4°C, incubated with 1 mM phenylmethylsulfonyl fluoride in 10 mM TE buffer at 23°C for 2 h, washed with 10 mM TE buffer at 23°C for 2 h, and equilibrated with 1× restriction enzyme buffer (New England Biolabs, Ipswich, MA) for 20 min at 4°C. Using fresh 1× restriction enzyme buffer, each plug was incubated with 25 U of NotI (New England Biolabs) for 16 h at 37°C. The plugs were inserted into a 1% (wt/vol) agarose gel and electrophoresed using the CHEF-DR II system (Bio-Rad, Hercules, CA) for 20 h at 200 V with an alternating field (90°/90 s for 1 h and 90° from 1 to 25 s for 19 h). DNA bands were stained with ethidium bromide.
RESULTS
gDNA from P. woesei was hybridized to the DNA microarray containing spots representing the 2,065 ORFs annotated in the P. furiosus genome. The fluorescence intensities were compared directly with those obtained using gDNA from P. furiosus. The results showed that close homologs of 1,890 (92%) of the 2,065 P. furiosus ORFs were present in the P. woesei genome. This included all 21 sequences of P. woesei genes available in the NCBI database (http://www.ncbi.nlm.nih.gov/ ) that are known to be (virtually) identical as determined by direct sequence comparisons. The remaining 175 P. furiosus ORFs (representing 8% of the P. furiosus genome) were not detected in P. woesei DNA at a significant level by microarray analysis, implying that close homologs of these genes are not present in the P. woesei genome. It is possible that some genes are apparently absent because they are highly divergent. However, this seems unlikely given the almost complete identity of all genes (and proteins) examined so far in the two species. The arrangement of the proposed missing ORFs on the P. furiosus genome is shown in Fig. 1. One striking feature is that in P. furiosus the missing ORFs form clusters or ORF islands. This pattern is similar to the previous suggestion (25) that such ORFs form islands or gene cassettes because of functional interactions. If P. woesei is as closely related as suspected and synteny is conserved, then the regions of missing ORFs are restricted to specific areas of the genome.
FIG. 1. Hybridization of genomic DNA from P. woesei to the P. furiosus DNA microarray. The data were normalized as described in Materials and Methods. The abscissa indicates P. furiosus ORFs (ORFs 1 to 2065). The bar at the top indicates the positions of P. furiosus insertion sequences.
The veracity of the microarray results was assessed by direct PCR analysis using primer pairs covering 137 of the 175 ORFs proposed by the microarray results (78% coverage). PCR products of the expected size were obtained for all 137 ORFs analyzed using gDNA from P. furiosus, but only 32 ORFs yielded PCR products when P. woesei gDNA was used. This analysis therefore confirmed the absence in the P. woesei genome of homologs of 105 P. furiosus ORFs (or 77% of the 137 microarray-indicated ORFs that were tested). Table 1 lists the P. furiosus ORFs missing from the P. woesei genome. Of the 105 ORFs, 93 were present in 20 gene clusters consisting of two or more genes, and these clusters are indicated in Table 1 (in which shading indicates potential operons). Analysis by InterPro of the amino acid sequences revealed that 37 of the 105 ORFs (35%) encode proteins with unknown functions that have no known homologs in other organisms. Conversely, 407 of the 444 ORFs that are annotated as hypothetical ORFs in P. furiosus appear to be present in the P. woesei genome according to the DNA microarray analysis. This suggests that these hypothetical ORFs in P. furiosus do indeed encode proteins. The results of a BLASTP analysis of the 105 P. furiosus ORFs not present in P. woesei and ORFs of three closely related species, P. abyssi (9, 24), P. horikoshii (31, 40), and T. kodakaraensis (29), are shown in Table 1. Only seven of the P. furiosus ORFs have homologs in the genomes of all three species, and only 11 ORFs have homologs in two of the three organisms. In P. furiosus, these ORFs are scattered in the genome, except for the gene cluster containing PF0764 to PF0770, a homolog of which is found in P. abyssi but not in the other two organisms.
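The counts quoted in this paragraph and the preceding one can be checked with a short tally. The sketch below only restates numbers given in the text; the variable names are hypothetical.

```python
# Tally of the microarray and PCR counts quoted in the text (names are hypothetical).
TOTAL_ORFS = 2065             # ORFs on the P. furiosus array
ARRAY_PRESENT = 1890          # ORFs detected in P. woesei by hybridization
PCR_TESTED = 137              # array candidates checked by PCR
PCR_POSITIVE_IN_WOESEI = 32   # tested ORFs that amplified from P. woesei gDNA
UNKNOWN_FUNCTION = 37         # confirmed-absent ORFs with no known homologs

array_candidates = TOTAL_ORFS - ARRAY_PRESENT            # ORFs not detected by the array
confirmed_absent = PCR_TESTED - PCR_POSITIVE_IN_WOESEI   # ORFs confirmed absent by PCR

coverage = PCR_TESTED / array_candidates             # fraction of candidates tested
confirmation_rate = confirmed_absent / PCR_TESTED    # fraction of tested ORFs confirmed
unknown_fraction = UNKNOWN_FUNCTION / confirmed_absent

if __name__ == "__main__":
    print(array_candidates, confirmed_absent)
    print(f"{coverage:.1%} {confirmation_rate:.1%} {unknown_fraction:.1%}")
```

The tally makes explicit that the 77% figure is relative to the 137 ORFs tested by PCR, not to all 175 ORFs flagged by the array.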
TABLE 1. P. furiosus ORFs that lack homologs in the P. woesei genome based on DNA microarray and PCR analyses
To gain further insight into the differences between the genomes of P. furiosus and P. woesei and the proposed absence of 105 ORFs in the latter organism, a PFGE analysis was performed with DNA isolated from both species after digestion using the NotI restriction enzyme. For P. furiosus DNA, this enzyme should generate six DNA fragments that are approximately 43, 132, 224, 385, 416, and 709 kbp long. Assuming that the two genomes differ only by the 105 ORFs, treatment of P. woesei DNA should also yield six fragments, two of which (42 and 132 kbp) are the same as fragments in P. furiosus. Each of the other four fragments from P. woesei DNA is predicted to be smaller (206, 371, 401, and 667 kbp) than the corresponding fragment from P. furiosus DNA. PFGE analysis revealed the expected six bands from P. furiosus DNA, and six bands were also seen after digestion of P. woesei DNA, all of which corresponded to the P. furiosus DNA fragments (data not shown). Five of the bands appeared to be the same in both species since differences of less than 20 kb were not resolved. However, the sixth fragment (667 kb) from P. woesei DNA was distinguishable from the fragment from P. furiosus DNA (which was predicted to be 41 kb larger). Given that no fragments were obtained from P. woesei DNA that were larger than predicted, we concluded that the genome of this organism is approximately the same size as the P. furiosus genome and lacks all 105 ORFs (equivalent to 88.8 kbp) predicted by the microarray and PCR analyses.
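The fragment-size reasoning above can be followed with simple arithmetic. The sketch below uses only the rounded sizes quoted in the text; the variable names are hypothetical, and the 20-kb resolution limit is applied as a plain threshold, which is a simplification of how bands are actually distinguished on a gel.

```python
# Predicted NotI fragment sizes in kbp, as quoted in the text (rounded values).
FURIOSUS_ALL = [43, 132, 224, 385, 416, 709]   # all six P. furiosus fragments
FURIOSUS_CHANGED = [224, 385, 416, 709]        # fragments that carry missing ORFs
WOESEI_PREDICTED = [206, 371, 401, 667]        # corresponding predicted P. woesei fragments
RESOLUTION_KB = 20                             # smaller differences were not resolved

# Size lost from each fragment if P. woesei lacks the 105 ORFs.
deficits = [f - w for f, w in zip(FURIOSUS_CHANGED, WOESEI_PREDICTED)]

# Which fragment pairs should be distinguishable by PFGE.
resolvable = [d >= RESOLUTION_KB for d in deficits]

total_deficit = sum(deficits)        # compare with the 88.8 kbp stated for the 105 ORFs
genome_total = sum(FURIOSUS_ALL)     # compare with the 1.9-Mb P. furiosus genome

if __name__ == "__main__":
    print(deficits, resolvable, total_deficit, genome_total)
```

With the rounded sizes, only the largest fragment differs by more than the resolution limit, which matches the single distinguishable band reported. The rounded deficit for that fragment is 42 kbp, whereas the text quotes 41 kb; the small gap is presumably a rounding effect.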
The power of the comparative DNA microarray approach is that it enables predictions regarding metabolism and physiology. Thus, of the ORFs listed in Table 1, of particular interest are the ORFs encoding proteins having known or predicted functions that are amenable to phenotypic analysis. One such gene cluster is the cluster containing PF1737 to PF1751, which includes the Mal I operon found in P. furiosus (1, 55) and in the related genus Thermococcus (50, 66). This operon encodes an ABC-type maltose/trehalose transporter (malEFG and malK, represented by PF1739 to PF1741 and PF1744, respectively), as well as a trehalose-degrading enzyme (PF1742) (1, 32, 36, 37, 44, 66). Interestingly, Thermococcus litoralis, which contains the Mal I operon, was also isolated from a shallow marine volcanic vent at Vulcano Island, Italy. In contrast, as shown by the BLASTP results (Table 1), P. horikoshii, P. abyssi, and T. kodakaraensis do not have a complete Mal I gene cluster. These three organisms were isolated from deep sea hydrothermal environments (4, 24, 31), perhaps implying that the availability of the Mal I gene cluster is limited to the vicinity of Vulcano Island. However, the apparent absence of the Mal I operon from P. woesei is inconsistent with a report that the organism is able to grow on maltose (68). Indeed, in our hands P. woesei exhibited very good growth (densities of >10^8 cells/ml) when the standard P. furiosus maltose-containing medium was used (65). This inconsistency was resolved by the finding that P. woesei did not exhibit significant growth on high-purity maltose (99%) rather than the technical grade usually used (95%, which contains 5% glucose and polysaccharides) (Fig. 2). Hence, in agreement with the DNA microarray analysis, maltose does not support growth of P. woesei.
FIG. 2. Growth of Pyrococcus species on 99% pure maltose. Symbols: ⧫, P. furiosus; ▴, P. woesei.
A second gene cluster of interest in P. furiosus, which was absent in P. woesei, is the cluster containing PF0691 to PF0695. This cluster contains an ORF (PF0694) which encodes a protein that exhibits between 32 and 65% sequence similarity to the flavoprotein nitric oxide reductase (NOR) from the anaerobic bacteria Moorella thermoacetica (59), Desulfovibrio vulgaris (60), and Desulfovibrio gigas (59, 61). The protein encoded by PF0694 has the conserved residues required to coordinate the binuclear nonheme iron site found in NOR (11, 20, 30). Analysis of the genome sequences available for 23 archaea revealed that only 6 of them have a homolog of the gene encoding the bacterial NOR. These organisms include Archaeoglobus fulgidus and the methanogens Methanobacterium thermoautotrophicum, Methanococcus jannaschii, Methanosarcina acetivorans, and Methanosarcina mazei. As indicated in Table 1, a close homolog of PF0694, which is annotated by InterPro as NO synthase, is not present in P. abyssi, P. horikoshii, or T. kodakaraensis.
In light of the presence of a putative NOR system in P. furiosus and the apparent absence of this system in P. woesei, the question arose as to whether there are any differences between the two organisms in their responses to reactive nitrogen species (RNS). However, sensitivity of archaea to RNS has not been reported. To investigate the responses of the two Pyrococcus species, we used the NO generator known as RBS. This iron-sulfur-nitrosyl compound delivers seven molar equivalents of NO and is a potent antimicrobial agent (7, 8, 38, 47, 48). If P. woesei does not contain a homolog of PF0694, the organism should be more susceptible to this NO generator than P. furiosus. As shown in Fig. 3, growth studies using RBS showed that P. furiosus was not significantly affected by addition of 0.9 μM RBS, while under the same conditions the P. woesei cultures were not viable. These results strongly suggest that the cluster containing PF0691 to PF0695, particularly PF0694, plays a key role in detoxifying RNS (3).
FIG. 3. Growth of P. woesei and P. furiosus in the presence of the NO generator RBS. The arrows indicate times at which RBS (0.9 μM) was added. Symbols: ▴, P. furiosus with RBS; ⧫, P. furiosus without RBS; ▪, P. woesei with RBS; •, P. woesei without RBS.
It is therefore clear that P. woesei and P. furiosus share a close genetic origin. Interestingly, analysis of the P. furiosus genome revealed that the ORF islands not detected in P. woesei are either bracketed by or are close to ISs. This is illustrated in Fig. 1. A notable exception is the putative gene cluster (PF0691 to PF0695) that encodes NOR (Table 1). However, this cluster has an IS on one side (PF0756), and on the other side, adjacent to PF0688, there is a ∼3.5-kb stretch of a long cluster of tandem repeats (LCTR). The tandem repeat that composes the LCTR is 29 nucleotides long. An LCTR is also located next to the cluster containing PF0025 to PF0032, which is another ORF cluster that is absent in P. woesei (Table 1). LCTRs are noncoding repeat sequences in tandem that are believed to behave like mobile elements and that have been proposed to participate in gene transfer (69). The positions of LCTRs next to some of the ORF clusters that are not present in P. woesei strongly support this proposition.
ISs are mobile elements that can transpose within the genome or into extrachromosomal elements (12, 46). The Mal I gene cluster found in P. furiosus (but not in P. woesei) is particularly noteworthy as it is packaged as a composite transposon. In other words, the Mal I cluster is bracketed by two identical ISs, and the whole composite transposon (including ISs) is flanked by matching direct repeats (DRs), an indication of insertion as a complete composite transposon. In fact, the sequences of the Mal I gene cluster are virtually identical in P. furiosus and T. litoralis, and it has been proposed that a lateral gene transfer event was responsible (23, 37). To investigate whether the nature of IS and DR elements around the Mal I gene cluster could provide insight into the phylogenetic relationship between P. furiosus and P. woesei, the sequence of the relevant region in P. woesei was determined. PCR primers were designed to anneal outside the vicinity of the composite Mal I transposon in P. furiosus to determine what was present in the corresponding region in P. woesei. The same experiment with P. furiosus produced a PCR product that was approximately 17 kb long (the Mal I composite transposon is 17,854 bases long), while P. woesei produced a fragment that was about 780 bp long (Fig. 4). The sequence of the PCR product from P. woesei (accession number DQ202294) revealed that synteny of surrounding ORFs was conserved in P. woesei and P. furiosus. An eight-nucleotide sequence, CAGGAGGA, was found in the P. woesei locus where the Mal I cluster is located in P. furiosus. This sequence is not spurious, as the DRs that bracket the P. furiosus Mal I composite transposon have the same sequence (Fig. 5).
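The relationship between the single eight-nucleotide sequence at the P. woesei locus and the matching DRs that flank the transposon in P. furiosus can be illustrated with a toy model of insertion with target-site duplication. Only the CAGGAGGA sequence comes from the text; the flanking sequences, the stand-in element, and the function names below are hypothetical and far shorter than the real locus.

```python
from typing import Optional

TARGET = "CAGGAGGA"  # eight-nucleotide sequence reported at the P. woesei locus


def insert_with_duplication(empty_site: str, target: str, element: str) -> str:
    """Insert an element next to the target so that a copy of the target flanks each side."""
    pos = empty_site.find(target)
    if pos < 0:
        raise ValueError("target sequence not found in the empty site")
    end = pos + len(target)
    # left flank + DR + element + DR + right flank
    return empty_site[:end] + element + target + empty_site[end:]


def flanking_repeat(occupied: str, element: str, k: int) -> Optional[str]:
    """Return the k-nt direct repeat if identical k-mers flank the element, else None."""
    start = occupied.find(element)
    if start < k:  # element missing, or too close to the start of the sequence
        return None
    end = start + len(element)
    left = occupied[start - k:start]
    right = occupied[end:end + k]
    return left if len(right) == k and left == right else None


# Hypothetical toy sequences: short flanks and a stand-in for IS + Mal I cluster + IS.
LEFT_FLANK, RIGHT_FLANK = "ATGCCGTTAA", "TTGACCGGTA"
TOY_ELEMENT = "GGGGACGTACGTGGGG"

empty = LEFT_FLANK + TARGET + RIGHT_FLANK          # P. woesei-like arrangement
occupied = insert_with_duplication(empty, TARGET, TOY_ELEMENT)  # P. furiosus-like arrangement

if __name__ == "__main__":
    print(empty.count(TARGET), occupied.count(TARGET))
    print(flanking_repeat(occupied, TOY_ELEMENT, len(TARGET)))
```

In this toy model the empty site carries one copy of the target and the occupied site carries two copies flanking the element, which is the pattern described for the two species.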
FIG. 4. Products of the Mal I region of P. furiosus and P. woesei as determined by PCR analysis. The products were analyzed on an agarose gel (0.5%, wt/vol) and were stained with ethidium bromide. The λ/HindIII-φX174/HaeIII DNA ladder was obtained from Stratagene.
FIG. 5. Sequences of the PCR products of the Mal I regions of P. furiosus and P. woesei. See text for details. The sequenced ORFs of the unsequenced P. woesei genome are labeled ‘PF1735’ and ‘PF1753.’ The nucleotide sequences of these two ORFs in P. woesei are identical to the nucleotide sequences of the corresponding P. furiosus ORFs.
DISCUSSION
Insertion sequences are suspected of playing key roles in the shaping of the P. furiosus genome that led to evolutionary divergence from P. abyssi and P. horikoshii (43, 45, 69). A total of 28 transposases are annotated in the P. furiosus genome, and our analysis of them showed that they comprise four groups. These groups were given the formal names ISPfu1 (8 isoforms, IS6 family), ISPfu2 (11 isoforms, IS6 family), ISPfu3 (5 isoforms, IS982 family), and ISPfu5 (4 isoforms, IS6 family) after analysis by ISfinder (http://www-is.biotoul.fr/ ). The ISs that bracket the Mal I gene cluster are isoforms belonging to group ISPfu1. This group is a member of the IS6 family, whose members transpose via replicative transposition. This is accomplished by formation of cointegrates between the donor and target sites that resolve, leaving a copy of the IS in the target and the donor site (46). Once a member of the IS6 family inserts into a locus, the copy remains at the new location, potentially replicating at other target sites in time. However, the PF1735/PF1753 locus of P. woesei does not contain an IS or a composite transposon, nor is there any indication that the locus has been involved in replicative transposition. Assuming that the archaeal ISs follow the observed IS6 replicative progression, we concluded that P. woesei is a sister and possibly the parent of P. furiosus. A P. furiosus strain would therefore be the type strain and presumably acquired ORFs (missing from P. woesei) from external sources at temperatures near 100°C in the same hydrothermal environment.
ISs likely play a pivotal role in shaping the genomes of the thermophilic community found at Vulcano Island and at similar locations on Earth. The use of DNA microarrays enables the first step to be taken toward understanding genomic phenomena such as the dissemination of gene cassettes within a community. This focuses attention on gene clusters found in archaeal communities rather than on individual genes or an individual organism. However, as demonstrated here, additional molecular and phenotypic characterizations are necessary to confirm the implications of array results.
ACKNOWLEDGMENTS
We thank Martin Hughes for his generous gift of Roussin's black salt, Patricia Siguier for assistance with the IS analysis, Bryan Gibson, Jon Voigt, and Melani Atmodjo for assistance with the growth studies, Farris Poole for the annotation analyses, Anna Glasgow Karls and Jim Holden for helpful discussions, and Bao Phan for assistance with PFGE.
This research was supported by grants BES-0317911 and MCB 0129841 from the National Science Foundation.
FOOTNOTES
- Received 12 April 2005.
- Accepted 9 August 2005.
- Copyright © 2005 American Society for Microbiology