Journal of Bacteriology, February 2007, p. 818-832, Vol. 189, No. 3
0021-9193/07/$08.00+0 doi:10.1128/JB.01180-06
Bioscience,1 Computing, Computational, and Statistical Sciences,3 Theoretical Divisions, Los Alamos National Laboratory, Los Alamos, New Mexico 87545,4 Integrated Toxicology Division, United States Army Medical Institute of Infectious Diseases, Fort Detrick, Maryland 21702,2 Department of Food Microbiology and Toxicology, Food Research Institute, University of Wisconsin, Madison, Wisconsin 53706,5 Defense Biology Division, Lawrence Livermore National Laboratory, Livermore, California 94551,6 Department of Anesthesia and Pharmaceutical Chemistry, University of California, San Francisco, Rm. 3C-38, San Francisco General Hospital, 1001 Potrero Ave., San Francisco, California 941107
Received 31 July 2006/ Accepted 21 October 2006
BoNTs are classified by the Centers for Disease Control and Prevention (CDC) as one of the six highest-risk threat agents for bioterrorism (the "category A agents") due to their extreme potency and lethality, the ease of production and transport, and the need for prolonged hospital intensive care for those exposed (1). Multiple countries have produced BoNT for use as weapons (5, 45), and the Japanese cult Aum Shinrikyo attempted to use BoNT for bioterrorism (1). Since the terrorist events of 11 September 2001 and the subsequent intentional release of anthrax spores, the development of environmental toxin sensors, diagnostic tests for botulism, and specific countermeasures for the prevention and treatment of intoxication has become a high priority. The first step in such research is to define the spectrum of diversity of BoNT-producing clostridial species and the toxins that they produce.
C. botulinum strains are usually described as belonging to one of four different groups (groups I, II, III, and IV) based on physiologic characteristics (18, 38). The toxins produced are categorized into seven serologically distinct groups (serotypes A through G), based on recognition by polyclonal serum (17). Each BoNT is encoded by an approximately 3.8-kb gene, which is preceded by a nontoxic nonhemagglutinin gene and several other genes that encode toxin-associated proteins (HA-17, HA-33, HA-70, p21, and/or p47) (3, 8, 11, 12, 34). The BoNT gene for strains of serotypes A, B, E, and F can be found within the bacterial chromosome. Serotype C and D strains produce toxin from a phage genome, and serotype G strains contain a plasmid containing the toxin operon (34). Strains producing interserotype recombinant toxins, primarily the C/D and D/C phage-encoded serotypes, have been reported (31, 32). Several strains produce multiple toxins. Bivalent C. botulinum strains, each producing two toxins of serotypes Ab, Ba, Af, and Bf, have been reported (4, 15, 37).
The genomic background containing these BoNT genes within C. botulinum has been characterized as being very diverse. Moreover, other species are known to harbor BoNT genes, such as Clostridium butyricum (BoNT/E) (2, 30), Clostridium baratii (BoNT/F) (16), and Clostridium argentinense (BoNT/G) (43). Previous 16S rRNA gene analysis of many different Clostridium species has shown that C. botulinum strains form four distinct clusters, with each cluster representing one of the four different physiological groups (groups I to IV) (8, 22). Previous amplified fragment length polymorphism (AFLP) analysis of 70 C. botulinum BoNT/A, B, E, and F strains showed that this technique could also successfully differentiate strains into the distinct group I and group II clusters (25). Like the 16S rRNA gene analysis, the AFLP results show that the group I cluster included BoNT/A, B, and F proteolytic strains, while group II contained BoNT/E and nonproteolytic B and F strains (25). Thus, the phylogeny of these species based on molecular analyses has supported the current taxonomy, which has been based on the physiologic attributes of the species and the toxins produced. Such analyses have contributed to the understanding of the diversity of the genomic backgrounds that contain the very different BoNT genes.
Recently, it has become evident that there is significant sequence diversity (subtypes) within the BoNT genes and toxins of at least six of the seven serotypes (39). Such subtypes can differ by 2.6% to 31.6% at the amino acid level, and these differences can affect the binding and neutralization by monoclonal and polyclonal antibodies (13, 28, 39). The relationship between toxin gene diversity and clostridial genomic diversity is unknown. Since an analysis of only 48 published full-length toxin gene sequences revealed the presence of 18 different subtypes, it is likely that additional subtypes exist (39). Defining the extent of such toxin diversity is a first step in the development of detection systems and countermeasures for the prevention and treatment of botulism (33, 39). In addition, analysis of a large population of strains can be used to better understand the evolutionary relationship between the toxin moieties and the genomic backgrounds that contain these toxins.
To better understand the extent of toxin gene diversity and the relationship between genomic diversity among C. botulinum serotypes and subtypes and other toxin-producing species of Clostridium, 174 toxin-producing strains from a collection that included representatives of all neurotoxin serotypes (BoNT/A to BoNT/G) were analyzed. Several methods were used to examine the strains, including sequencing of the 16S rRNA gene, analysis of the genome by AFLP, and sequencing of BoNT/A, B, and E neurotoxin genes. Nucleotide sequences of the 16S rRNA and BoNT genes from these and other previously sequenced Clostridium strains were analyzed by phylogenetic and recombination detection methods. The phylogenetic relationships among these strains based on all of these methods as well as the extent of toxin gene diversity and the relationship between toxin types, subtypes, and genomic differences are presented.
TABLE 1. C. botulinum strains analyzed
AFLP analysis of DNA samples. The selection of the two restriction endonucleases to digest the genomic DNAs and the single nucleotide added for subsequent selective amplification of the resulting fragments was based on the low G+C content (28.2%) of the C. botulinum genome (http://www.sanger.ac.uk/Projects/C_botulinum/). EcoRI and MseI, with recognition sites of GAATTC and TTAA, respectively, were used to digest 100 ng of DNA from each sample. The resulting fragments were ligated to double-stranded adapters. The digested and ligated DNA was then amplified by PCR using EcoRI and MseI +0/+0 primers (5'-GTAGACTGCGTACCAATTC-3' and 5'-GACGATGAGTCCTGAGTAA-3', respectively). Five microliters of each product was used as a template in subsequent selective amplifications using the +1/+1 primer combination of 6-carboxyfluorescein-labeled EcoRI-T (5'-GTAGACTGCGTACCAATTCT-3') and MseI-T (5'-GACGATGAGTCCTGAGTAAT-3') (the 3'-terminal T is the selective nucleotide that distinguishes these from the +0/+0 primers). Selective amplifications were performed in 20-µl reaction mixtures. The resulting products (0.5 to 1.0 µl) were mixed with a solution containing DNA size standards, Genescan-500 (Applied Biosystems Inc., Foster City, CA) labeled with N,N,N,N-tetramethyl-6-carboxyrhodamine. Following a 5-min heat denaturation step at 95°C, the products were loaded onto an ABI 3100 automated fluorescent sequencer. Each set of AFLP reaction mixtures also contained a control DNA as a template. Inclusion of such a reaction mixture in each run-and-analysis set allowed a comparison of results from previously archived analysis sets that were run at different times. Genescan analysis software (Applied Biosystems, Inc., Foster City, CA) was used to determine the lengths of the sample fragments by comparison to the DNA fragment length size standards included with each sample. To minimize capillary gel electrophoresis artifacts, each labeling reaction product was run in triplicate. Samples were loaded into a 96-well plate in a random order.
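To illustrate why a low G+C content bears on the choice of enzymes, the following minimal Python sketch estimates how often each recognition site would be expected in a genome of 28.2% G+C. It rests on a simplifying assumption that is not part of the original methods: bases occur independently, with G = C and A = T frequencies. The function name `expected_site_spacing` is hypothetical.

```python
def expected_site_spacing(site: str, gc_fraction: float) -> float:
    """Expected mean distance (bp) between occurrences of `site`.

    Assumption: bases are independent, with P(G) = P(C) = gc_fraction / 2
    and P(A) = P(T) = (1 - gc_fraction) / 2. Real genomes deviate from this.
    """
    p_gc = gc_fraction / 2.0
    p_at = (1.0 - gc_fraction) / 2.0
    prob = 1.0
    for base in site.upper():
        prob *= p_gc if base in "GC" else p_at
    return 1.0 / prob


GC_CONTENT = 0.282  # G+C fraction quoted in the text

for name, site in (("EcoRI", "GAATTC"), ("MseI", "TTAA")):
    spacing = expected_site_spacing(site, GC_CONTENT)
    print(f"{name} ({site}): about one site per {spacing:,.0f} bp")
```

Under these assumptions the A+T-only MseI site is expected roughly every 60 bp and the EcoRI site roughly every 3,000 bp, so the pairing of a frequent cutter with a rarer cutter is consistent with the rationale given above.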
AFLP data analysis was performed as described previously by Ticknor et al. (45). Sample fragments between 100 and 500 bp and with fluorescence above 50 arbitrary units in all three runs on the ABI sequencer were used in the analysis. Similarities among samples were determined using three separate methods to allow comparisons between methods. First, the Jaccard coefficient, which compares the presence and absence of fragments of a given length, was used. Second, Euclidean distance with the relative abundance values was used, so that both presence and abundance are compared. Third, a Manhattan distance was used, which is similar to the Euclidean distance except that absolute differences, rather than squared differences, are summed. The 40 tallest peaks for each sample fingerprint were used to calculate the distance coefficients among samples. Dendrograms were produced from each of the three similarity matrices using the unweighted-pair group average agglomerative hierarchical clustering method (24). All statistical data manipulations were done using code developed in S-Plus (Data Analysis Products Division, MathSoft, Seattle, WA). The dendrograms using the Euclidean and Manhattan distances, which include relative fragment abundance values, were compared to the Jaccard distance dendrogram, and there were no differences in the groupings. This shows that these groupings are robust and are not artifacts of the data analysis methods. The dendrogram using the Jaccard distances is presented. Replicates have Jaccard distance measures at the 0.20 level or below. No differences below the 0.20 level on the Jaccard dendrogram are presented, since it cannot be determined whether such differences are due to variability in the assay or actual sample differences.
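The peak selection and the three distance measures described above can be sketched in Python as follows. This is an illustration only, not the S-Plus code used in the study. Averaging peak heights across the three runs and normalizing heights to fractions of the total signal are assumptions made here, and all function names are hypothetical.

```python
from math import sqrt


def select_peaks(runs, min_bp=100, max_bp=500, min_height=50, top_n=40):
    """Build a fingerprint from replicate runs.

    runs: list of dicts {fragment_length_bp: peak_height}, one per run.
    Keeps fragments inside the size window whose height exceeds min_height
    in every run, then keeps the top_n tallest (by mean height, an assumption).
    """
    shared = set(runs[0])
    for run in runs[1:]:
        shared &= set(run)
    kept = {}
    for length in shared:
        heights = [run[length] for run in runs]
        if min_bp <= length <= max_bp and min(heights) > min_height:
            kept[length] = sum(heights) / len(heights)
    tallest = sorted(kept, key=kept.get, reverse=True)[:top_n]
    return {length: kept[length] for length in tallest}


def relative_abundance(peaks):
    """Scale peak heights so they sum to 1 (assumed normalization)."""
    total = sum(peaks.values())
    return {length: height / total for length, height in peaks.items()}


def jaccard_distance(a, b):
    """Proportion of fragments that two fingerprints do not share."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def euclidean_distance(a, b):
    """Square root of summed squared abundance differences."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def manhattan_distance(a, b):
    """Sum of absolute abundance differences."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


# Toy fingerprints (made-up fragment lengths and heights).
sample_1 = relative_abundance({120: 400, 250: 300, 310: 300})
sample_2 = relative_abundance({120: 500, 250: 250, 480: 250})
print(jaccard_distance(sample_1, sample_2))
print(euclidean_distance(sample_1, sample_2))
print(manhattan_distance(sample_1, sample_2))
```

A pairwise matrix of any of these distances could then be passed to an average-linkage (unweighted-pair group) clustering routine to produce a dendrogram.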
16S rRNA gene sequencing of C. botulinum samples. Representatives of the different BoNT-producing Clostridium strains were selected for 16S rRNA gene sequencing. Primers 1492R (5'-GGTTACCTTGTTACGACTT-3') and 27F (5'-AGAGTTTGATCMTGGCTCAG-3') were used to PCR amplify approximately 1,400 bases of this 1.5-kb gene. The purified PCR template was then sequenced using these primers and internal primers 533Fb (5'-GCCAGCAGCNGCGGTAA-3'), 940Fb (5'-CGGGGGYCCGCACAAGC-3'), and 910Rb (5'-GCCCCCGTCAATTYHTTTGAG-3'). The 16S rRNA gene phylogenetic dendrogram was created from an alignment of 16S rRNA gene sequences, some new to this study and others obtained from GenBank entries of previously sequenced genes. It should be noted that Clostridium genomes each contain more than one copy or allele of the 16S rRNA gene. After multiple sequence alignment with MUSCLE (http://www.drive5.com/muscle/), columns in the alignment in which more than 80% of the sequences were represented by a gap character were removed, leaving an alignment of 1,329 bases for phylogenetic analysis. The phylogenetic dendrogram was calculated using PHYLIP dnadist and neighbor programs with the F84 model of evolution and four sequences from the genus Alkaliphilus (GenBank accession numbers AY554415, AB037677, AF467248, and AJ630291) to serve as the outgroup to the Clostridium genus sequences. The resulting tree was rendered with TreeTool (http://packages.debian.org/unstable/science/TreeTool/), and the outgroup was removed to produce the final figure.
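The gap-filtering step applied to the alignment can be sketched as below. This is a minimal Python illustration that assumes the alignment is held as a dictionary of equal-length strings and that gaps are written as "-" or "."; the function name `drop_gappy_columns` is hypothetical and is not part of MUSCLE or PHYLIP.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.80, gap_chars="-."):
    """Remove alignment columns in which more than max_gap_fraction of the
    sequences carry a gap character.

    alignment: dict mapping sequence name -> aligned sequence string.
    Returns a new dict with the same names and the filtered sequences.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        gap_count = sum(seq[col] in gap_chars for seq in seqs)
        if gap_count / len(seqs) <= max_gap_fraction:
            keep.append(col)

    return {name: "".join(alignment[name][col] for col in keep) for name in names}


# Toy alignment: the third column is a gap in 5 of 6 sequences (83%)
# and is removed; the last column (67% gaps) is retained.
toy = {
    "s1": "AC-GT",
    "s2": "AC-GT",
    "s3": "AC-G-",
    "s4": "ACTG-",
    "s5": "AC-G-",
    "s6": "A--G-",
}
filtered = drop_gappy_columns(toy)
for name, seq in filtered.items():
    print(name, seq)
```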
BoNT gene PCR amplification and sequencing. Overlapping primer pairs covering the coding sequence of the different BoNT genes were designed for PCR amplification using available GenBank sequences. Internal DNA oligomers were also designed within each amplicon to provide confirming sequence data in both directions. These PCR amplification and sequencing primers for each of the neurotoxin gene fragments are listed in Table 2. The initial PCR mixture contained 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.001% (wt/vol) gelatin, 0.2 mM each deoxynucleotide triphosphate, 20 pmol of each primer, 2.5 U of Amplitaq DNA polymerase (Perkin-Elmer, Inc., Boston, MA), and approximately 1 ng template DNA in a 100-µl total reaction volume. Template DNA was initially denatured by heating at 94°C for 2 min. This was followed by 35 cycles of denaturation at 94°C for 1 min, annealing at 55°C for 1 min, and primer extension at 72°C for 1 min. Incubation for 5 min at 72°C followed to complete the extension. PCR amplicons were analyzed by electrophoresis through a 3.0% agarose gel dissolved in a solution containing 10 mM Tris-borate (pH 8.3) and 1 mM EDTA for 1 h at 80 V. Gels were stained for 20 min with a solution containing 1 µg of ethidium bromide/ml, destained in distilled water, and then visualized and photographed under UV light. PCR amplicons were purified using a QIAGEN PCR purification kit (QIAGEN Inc., Valencia, CA) and then sequenced using ABI Dye Terminator 3.1 chemistry with an ABI 3730 instrument.
TABLE 2. Primers used for PCR amplification and sequencing of BoNT/A, B, and E genes
FIG. 4. Similarity plot comparing BoNT subtype sequences to the BoNT/A2 subtype. BoNT sequences of the BoNT/A1, A3, and A4 subtypes and BoNT/B1 and Chinese C. butyricum BoNT/E were compared to the BoNT sequence of the BoNT/A2 Kyoto-F subtype (GenBank accession number X73423). This plot illustrates that the BoNT/A2 subtype is approximately 99% identical to the BoNT/A1 subtype (A142) through nucleotides 1 to 1146 and approximately 99% identical to the BoNT/A3 subtype (A254) through nucleotides 1147 to 3450. This suggests that the BoNT/A2 subtype is a result of a recombination event between BoNT/A1 and BoNT/A3 lineages of gene sequences.
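The idea behind a similarity plot of this kind can be sketched with a sliding-window identity calculation. The Python example below is a generic illustration, not the software used to produce the figure; the window and step sizes and the toy sequences are arbitrary assumptions. A recombinant reference shows near-complete identity to one parent on one side of the breakpoint and to the other parent on the other side.

```python
def window_identity(query, reference, window=200, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (window_center, percent_identity) tuples.
    Columns where either sequence has a gap ("-") are ignored.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(reference) - window + 1, step):
        pairs = [
            (q, r)
            for q, r in zip(query[start:start + window],
                            reference[start:start + window])
            if q != "-" and r != "-"
        ]
        if not pairs:
            continue
        matches = sum(q == r for q, r in pairs)
        points.append((start + window // 2, 100.0 * matches / len(pairs)))
    return points


# Toy data: parent_b differs from parent_a at every 10th position, and the
# "recombinant" takes its first half from parent_a and the rest from parent_b.
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}
parent_a = "ACGT" * 100
parent_b = "".join(SWAP[base] if i % 10 == 0 else base
                   for i, base in enumerate(parent_a))
recombinant = parent_a[:200] + parent_b[200:]

for label, parent in (("parent_a", parent_a), ("parent_b", parent_b)):
    profile = window_identity(parent, recombinant, window=50, step=10)
    print(label, profile[0], profile[-1])
```

Plotting each profile against window position gives curves that cross at the breakpoint, which is the pattern the caption describes for BoNT/A1 and BoNT/A3 relative to BoNT/A2.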
16S rRNA gene analysis. Comparative analysis of the nucleotide sequences of the conserved 16S rRNA gene using a subset of 109 strains representing the different serotypes in this collection and those from other Clostridium species illustrates that the genetic distances between the toxin-producing groups are typical of the distances between other species within this genus. Figure 1 illustrates that the toxin-producing clostridia comprise the four previously characterized distinct phylogenetic clusters, which would logically be defined as discrete species within the Clostridium genus. These results confirm and extend previous work of many others (8, 21, 22, 42). The majority of the strains in this collection are tightly clustered, with 16S profiles that are identical or nearly identical to many sequences that have previously been designated as belonging to the proteolytic group I C. botulinum strains that contain all of the A and most of the B and F neurotoxin genes (22). The remaining strains form phylogenetically distant species clusters that define the remaining physiological groups, groups II, III, and IV (21). The group II strains include all of the neurotoxin E strains, nonproteolytic B strains, and nonproteolytic F strains and are most closely related to Clostridium beijerinckii and C. butyricum. Group III strains (closely related to Clostridium novyi) produce BoNT/C and D and the C/D and D/C recombinant serotypes, which are encoded by toxin operons on a bacteriophage. The 16S sequences of group IV strains are nearly identical to those of Clostridium subterminale and C. argentinense; these strains produce BoNT serotype G, which is encoded by a plasmid.
FIG. 1. Phylogenetic dendrogram of Clostridium species based on 16S rRNA genes. A neighbor-joining tree of 54 sequences reported in GenBank and 36 sequences representative of the strains from this collection is shown. This illustrates the genetic diversity within the clostridia. C. botulinum strains cluster into four distinct groups that follow the group I to group IV designation historically based on physiological characteristics. These groups are interspersed among the 27 other clostridial species in the tree. The tree was constructed using an alignment of 16S rRNA gene sequences that contained 1,329 bases after removal of columns containing more than 80% gap characters and includes sequences from bivalent, nonproteolytic, and proteolytic toxin-producing strains.
FIG. 2. AFLP-based dendrogram of 174 C. botulinum strains. DNA fragments generated from restriction endonuclease digestion of each of the strain DNAs were ligated into linkers and selectively amplified. Forty DNA fragments generated by AFLP experiments were used as a fingerprint to represent each of the strains. If 40 fragments did not exist, fewer fragments were used, as noted in parentheses. The comparison of fingerprints from the 174 strains shows a large separation between the proteolytic (group I) and nonproteolytic (groups II, III, and IV) strains and distinct branches representing groups I to IV. The AFLP groups also contain generally distinct toxin serotypes. The distance measure or genetic distance is the proportion of fragments that two samples do not have in common.
The AFLP analysis subdivides the group I proteolytic BoNT/B strains into smaller clusters, which include the serologically distinct BoNT/B1- and BoNT/B2-producing strains (27) and four bivalent BoNT/B-producing strains (Ba207, Ab149, Bf698, and Bf258). The most common BoNT/B subtype represented here is the BoNT/B2 subtype. The BoNT/B1 strains are more likely to be of U.S. origin and associated with food-borne cases due to improperly processed vegetables, while the BoNT/B2 strains are mostly from Europe and associated with animal cases or meat. The original BoNT/B2 strain was isolated from a case of infant botulism in Japan, and two recently published sequences from BoNT/B strains isolated from Korean soil (GenBank accession numbers DQ417353 and DQ417354) are also from BoNT/B2 strains. The BoNT/A2 (Ab149) and BoNT/A3 (A254) subtype strains and proteolytic BoNT/F strains cluster in separate branches within the BoNT/B strains. The five proteolytic BoNT/F strains cluster together and are distinct from the BoNT/B strains. These branches reveal genetic similarities of proteolytic BoNT/B strains with both BoNT/A subtypes and proteolytic BoNT/F-producing strains and support the group I designation for all of these strains. The close relationship among the BoNT/B- and BoNT/F-producing strains is also observed in the group II area of the AFLP dendrogram, where three nonproteolytic BoNT/B-producing strains (B160, B257, and B697) cluster and are most closely related to a nonproteolytic BoNT/F-producing strain (F550).
In addition, four bivalent strains of serotypes Ab (Ab149), Ba (Ba207), and Bf (Bf698 and Bf258) included in this study cluster together at the 0.2 level in this portion of the AFLP dendrogram. The genetic backgrounds of these four strains cannot be distinguished by AFLP analysis, and their 16S rRNA genes were found to be more than 99.93% identical to one another. By comparison, A150 and Bf258, separated by AFLP analysis, contained 16S rRNA gene sequences that were 99.78% identical to each other. These bivalent strains with similar genetic backgrounds each contain combinations of the different toxin genes BoNT/A, B, and F expressed at different levels. This finding appears to indicate very recent horizontal transfer of these toxin genes into the same bacterial lineage. All these strains were isolated from cases of infant botulism in different geographic locations: Sweden, Texas, New Mexico, and Utah.
Group II C. botulinum BoNT/E-producing strains, which are usually associated with fish and marine mammals, appear within their own branch of the AFLP dendrogram. The 21 BoNT/E-producing strains include samples from salmon, whale, and soil from the Olympic National Forest. The placement of these group II BoNT/E strains within a distinct branch of the AFLP dendrogram reflects the genetic background of these strains that have evolved to include different hosts and environmental habitats occupied by this serotype. The only C. butyricum strain (E543) containing a BoNT/E gene in this study is distant from these other 20 C. botulinum type E strains and was isolated from an infant botulism case in Italy (30). A small branch within the BoNT/E-producing strains includes three isolates (E213, E538, and E542) whose differences are below the replicate variability in this AFLP analysis. These three isolates of the "Beluga" strain, which were received from two different research collections (USAMRIID and Virginia Polytechnic Institute), were intentionally included in these experiments. These Beluga isolates are indistinguishable and add confidence to the results obtained using strains collected by different investigators over many years.
The majority of the group III BoNT/C (17/19) and BoNT/D (6/6) strains form a distinct branch in the AFLP dendrogram. These group III strains form several clusters containing BoNT/C strains or combinations of BoNT/C and D serotypes that are not distinguishable by this method. One cluster contains eight BoNT/C strains (C167, C174, C210, C522, C523, C530, C532, and C659), seven of which are from Western Europe. These strains are linked to disease in mammals. The five strains in another cluster, shown to be C/D strains, were collected from marine or freshwater sediments. Three of the strains (C525, C526, and C527) are from marine sediments in the United States. Strain C209, which differs slightly from them, is from Japan. Other group III strains are from the United States (C529), Japan (D701), South America (C700), and Africa (C524, C699, and D535). Two of these isolates, C523 and C659, were identical strains that were intentionally included in this study, and the results show that these two strains cluster at the 0.2 level by AFLP analysis. A distant branch contains a BoNT/C strain (C531) and a BoNT/C/D strain (C528); this placement shows these two strains to be most similar to the BoNT/G strains.
The final cluster includes all seven BoNT/G strains in the AFLP dendrogram. This plasmid-encoded toxin gene was first identified in isolates from soil in Argentina (14). Two of the strains in this study are from Argentinean soil (G190 and G194), and the other five strains are from human autopsy specimens in Switzerland (40, 41). These seven samples from different sources show genetic similarity and cluster at the 0.25 level in this portion of the AFLP dendrogram. Four of the five autopsy specimens cluster together with one of the soil isolates (G194). The fifth human specimen (G193) maps closer to the second soil isolate (G190) than to the others.
Results of the AFLP analysis reported here support previous AFLP analyses that showed that this technique could differentiate group I and group II C. botulinum strains (25). The current work extends those findings to include strains that are representative of groups III and IV. This analysis illustrates the relationship of the different genetic backgrounds in the clostridia that contain these neurotoxin genes.
Sequencing and analysis of BoNT genes. To understand how conserved the sequences of the different BoNT genes are within C. botulinum strains of a given serotype, the full-length coding sequence of each BoNT/A, B, and E gene was amplified in overlapping segments by PCR and then sequenced. Comparisons of the neurotoxin sequences generated from the 60 BoNT/A genes sequenced here, as well as six previously published BoNT/A gene sequences, show that at least four distinct groups of BoNT/A sequences exist (Fig. 3). Ninety percent of the BoNT/A-producing strains in this study (54/60 strains) show little sequence variation in the BoNT/A gene and are of the previously reported BoNT/A1 subtype (9, 50). Within this subtype, 37 of the strains share identical sequences and differ from 16 of the remaining 17 strains in this subtype by two nucleotides. These 16 strains are A1(B) strains that contain a silent BoNT/B gene. Sequences were generated from six of the silent BoNT/B genes in these A1(B) strains and compared. All six of the silent BoNT/B sequences were similar to those reported under GenBank accession number AF300467 (26), which generate a truncated protein from a stop codon at amino acid 128. Four of the sequences (A148, A397, A404, and A406) were identical to each other but differed from that reported under accession number AF300467 by two single nucleotide polymorphisms. The other two sequences (A408 and A411) were identical to each other but different from the other four silent BoNT/B sequences by a single nucleotide polymorphism. The identification of these different silent BoNT/B gene sequences shows that there are more differences in clostridial strains than revealed by AFLP analysis and 16S rRNA and BoNT/A gene sequence analysis.
FIG. 3. Comparison of BoNT/A gene sequences. The full-length coding region of the BoNT/A gene in 60 strains and six GenBank sequences were aligned. Four distinct subtypes are apparent. Most strains (54 strains) are of the BoNT/A1 subtype, and four strains are within the BoNT/A2 subtype. Two newly identified subtypes, BoNT/A3 and BoNT/A4, each contain one member: the A254 (Loch Maree) strain and the bivalent Ba207 strain, respectively. These strains show significant sequence variations compared to BoNT/A1 and A2 subtypes.
TABLE 3. Nucleotide and amino acid identities in strains representing the BoNT/A, B, and E subtypes
Comparison of the 53 BoNT/B genes sequenced for this work and an additional 7 previously reported BoNT/B genes also demonstrated the existence of four, or possibly five, distinct groups. These groups represent the four previously described BoNT/B subtypes, BoNT/B1, BoNT/B2, bivalent BoNT/B, and nonproteolytic BoNT/B (Fig. 5) (20, 23, 27, 37, 48). Compared to BoNT/A, each subtype had more members, with BoNT/B2 being produced by the largest number of strains in our collection. There was also more nucleotide variation within members of each cluster compared to BoNT/A. Nucleotide and amino acid comparisons of the four BoNT/B subtypes are shown in Table 3. Nucleotide differences range from 2% to 4%, with amino acid differences ranging from 4% to 6%. One sequenced BoNT/B isolate (B506) differed from the closest BoNT/B2 strain by 33 nucleotides, which represents a 2% difference at the amino acid level.
FIG. 5. Comparison of BoNT/B gene sequences. The full-length coding regions of the BoNT/B gene in 53 strains and seven GenBank sequences were aligned. Four distinct clusters that include the BoNT/B1 and BoNT/B2 and bivalent (Ab149, Ba207, Bf258, and Bf698) and nonproteolytic BoNT/B subtypes are apparent. Most strains are of the BoNT/B2 subtype, with 16 strains being of the BoNT/B1 subtype. Strain B506 is separate from the other BoNT/B2 strains and represents a newly identified variation in this serotype.
FIG. 6. Comparison of BoNT/E gene sequences. The full-length coding regions of the BoNT/E gene in 21 strains and 15 GenBank sequences were aligned, resulting in five clusters labeled E1 to E5. Two clusters contain sequences from C. butyricum BoNT/E strains collected in Italy (E It. butyr.) or China (E Ch. butyr.). The other subtypes include BoNT/E1 and E2 and a newly identified subtype, labeled BoNT/E3, containing four members (E185, E540, E545, and E549).
FIG. 7. Comparison of the seven different serotypes of BoNT gene sequences. Shown is a neighbor-joining alignment of the nucleotide coding regions of the seven BoNT genes (A through G) including the tetanus toxin. The comparison of the BoNT genes shows a different relationship of the serotypes than what is found based on 16S rRNA genes or AFLP analysis. Nonproteolytic and bivalent strains (Ba207 and Ab149) and representatives of the different subtypes are included.
The taxonomy of the C. botulinum species has historically been based on the identification and/or expression of botulinum toxin genes (38). Since C. butyricum and C. baratii strains that contain BoNT genes have been identified (2, 10, 16, 35), the taxonomy of the toxin-producing clostridia has become more complex. The dendrogram generated using 16S rRNA gene sequence data suggests that the different botulinum neurotoxins that define the species Clostridium botulinum are actually contained in genomes from four different clostridial species. The 16S rRNA gene dendrogram demonstrates that BoNT/A-, BoNT/B-, and BoNT/F-producing strains are closely related to each other and to Clostridium sporogenes and probably evolved from a common ancestor. However, the genomes for the BoNT/C-, D-, E-, and G-producing strains have 16S rRNA gene sequence profiles that closely align to distant clostridial relatives including C. novyi/C. haemolyticum, C. baratii, and C. subterminale (Fig. 1). To avoid confusion, and because these taxonomic designations have been based on strong phenotypic as well as genotypic characteristics, the results reported here should not change the basic nomenclature for C. botulinum. However, the presence of related toxin genes in distantly related clostridia serves as a reminder that horizontal gene transfer has played a significant role in the evolution of Clostridium botulinum.
AFLP analysis of these strains illustrates clustering by group designation and by toxin serotype. The AFLP-based dendrogram divides the strains into clusters that follow the group I to group IV designations, which are based on physiological characteristics. AFLP analysis clearly separates the proteolytic and nonproteolytic groups and shows the relationship of the genomic backgrounds among strains that are usually defined by the expression of a single 3.8-kb BoNT gene into one of seven different serotypes. This AFLP analysis shows a close relationship of BoNT/A1 subtypes to the A1(B) strains that are distant from the BoNT/A2 and BoNT/A3 subtypes that lie within the BoNT/B1 and BoNT/B2 subtypes. AFLP also shows relationships among the proteolytic BoNT/B and BoNT/F isolates that are mirrored in the nonproteolytic BoNT/B and BoNT/F branches. AFLP analysis supports the group III clustering of BoNT/C and BoNT/D serotypes and the clustering of the group IV BoNT/G strains as distinct from the other serotypes.
Four out of the five bivalent strains in this study cluster together within a branch of the AFLP-based dendrogram that also contains strain A254, which produces BoNT/A3. This branch is also related to a branch containing three BoNT/A2-producing strains, including the remaining bivalent strain, Af695. These bivalent strains contain BoNT/A, B, and F genes and were all isolated from infant botulism cases in different geographic locations. The genomes of these isolates cannot be distinguished by AFLP, yet these strains contain different combinations of neurotoxin genes. The sequences of the individual neurotoxin genes show that the BoNT/B gene sequence in all of these strains is of the same subtype (not identical sequences) but that the BoNT/A genes differ, representing different subtypes. Ab149 contains a BoNT/A2 subtype sequence, but Ba207 contains a completely new BoNT/A subtype that we have termed BoNT/A4. These results suggest either that two lineages of a single strain already carrying the BoNT/B gene acquired the BoNT/A2 and BoNT/A4 genes horizontally or that two strains carrying BoNT/A2 and BoNT/A4 genes both acquired the same BoNT/B gene horizontally. Southern blotting or genome analysis of toxin gene integration sites would be necessary to distinguish between these possibilities.
Four of the five strains producing two BoNT serotypes (bivalent strains) were isolated from infants with botulism. This high proportion of bivalent strains found in infants might reflect sample bias within this collection, but this has been reported previously by others (4). Of the 10 strains isolated from infants, 4 were found to be bivalent in this study. These 10 strains that affected infants are located in different branches of the AFLP dendrogram and include a C. butyricum BoNT/E-producing strain (E543) from Italy. An examination of the sequences of the BoNT/A, B, and E genes from these strains from infants shows that the toxin gene frequently represents a unique cluster within the serotype. Within both the BoNT/A and the BoNT/B gene sequence-based dendrograms, three of the nine clusters in the trees contain BoNT produced by strains from infant cases; all of the bivalent strains producing BoNT/B form a unique cluster, as does the single strain (E543) producing BoNT/E. It must be noted, however, that the strains obtained from infants in this collection were deliberately chosen for their unusual characteristics and that a large collection of infant isolates may show higher percentages of the more common BoNT/A1 and BoNT/B1 subtypes.
The neurotoxin gene sequence comparisons of all of the toxin serotypes (serotypes A to G) suggest that the BoNT gene has evolved separately in different genomic backgrounds. The dendrogram indicates that the seven BoNT genes form three distinct clusters: a large cluster consisting of the A, E, and F neurotoxins; a second cluster composed of the B and G toxins; and a third cluster composed of the C and D toxins. These relationships are different from the group I to group IV designations supported by the 16S rRNA gene sequences and AFLP analysis. This discordant phylogeny suggests that gene transfer has occurred among different clostridial species and that C. botulinum has contributed to the movement of the BoNT gene into various genetic backgrounds. The nucleotide differences within these neurotoxin genes are probably a result of both natural variation and selection pressure. Recombination events, similar to that illustrated in Fig. 4, where an A1/A3 recombination created BoNT/A2, can also be found within other toxin gene lineages, including many C/D and D/C interserotype recombination events that have previously been reported (31). Several recombination events within the nontoxic nonhemagglutinin genes of A1, B, and F strains have also been described (11).
The current analysis of 134 BoNT/A, B, and E toxin genes significantly increases our understanding of the extent of subtype variability within these three serotypes. The neurotoxin sequences demonstrate that there is more diversity within these toxin serotypes than previously known (summarized in reference 39). Two new BoNT/A genes, one new BoNT/B gene, and two new BoNT/E genes were identified. The two new BoNT/A genes clearly represent new BoNT/A subtypes that we have termed BoNT/A3 and BoNT/A4. Subtypes have historically been defined by the differential binding of monoclonal antibodies (13, 28, 39), and the 15% and 11% amino acid differences between BoNT/A1, A3, and A4 would certainly result in differential binding of some BoNT/A monoclonal antibodies (39). The toxins encoded by the new BoNT/B gene (BoNT/B3) and the new BoNT/E genes (BoNT/E2 and E3) differ from BoNT/B1 and BoNT/E1 by 4%, 1%, and 2% at the amino acid level, respectively. It is not clear whether these new toxins represent new toxin subtypes using the historical standard of monoclonal antibody binding. While single amino acid changes can cause a loss of antibody binding, whether the amino acid differences in these toxins are large enough to result in differential monoclonal antibody binding is unknown and must await the completion of studies using panels of monoclonal antibodies. However, lacking monoclonal antibody studies, subtypes could also be defined based on nucleotide or, more appropriately, amino acid differences, especially where multiple members are identified from different strains. This is the case for the BoNT/E2 and E3 genes.
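As a minimal illustration of how a sequence-based subtype criterion could be applied, the following Python sketch computes the percent amino acid difference between two sequences that have already been aligned. The function name, the gap-handling rule (a gap opposite a residue counts as a difference; columns that are gaps in both sequences are skipped), and the toy sequences are assumptions made for illustration only; they are not the alignment or distance method used in this study, and the toy strings are not BoNT sequences.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of differing columns between two pre-aligned sequences.

    Assumptions (illustrative only): '-' marks an alignment gap, a gap
    opposite a residue counts as a difference, and columns that are gaps
    in both sequences are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Hypothetical toy sequences (not real toxin sequences).
reference = "MKLVNDQFTY"
variant = "MKLVNDPFTY"
difference = percent_difference(reference, variant)
```

A fixed percentage cutoff for calling a new subtype is not specified here; any threshold applied to such a value would be a separate assumption.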
Accurate analyses and understanding of the recombinations between toxin genes of different serotypes and subtypes may be more helpful for identifying potential vaccines and therapeutic antibodies than relying on phylogenetic dendrograms or overall pairwise sequence distances. For example, BoNT/A2 represents a recombination of the 5' end of the BoNT/A1 light-chain gene with the 3' end of the BoNT/A3 gene. This analysis permits the identification of regions of BoNT that could be used to generate antibodies that can cross-react with all three subtypes. Similarly, knowledge of the recombination site between BoNT/C and D will allow the identification of targeted regions for the generation of antibodies that cross-react with chimeric BoNT/C and D.
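The idea of locating a recombination site by asking which parental sequence each region of a gene most resembles can be sketched as a simple sliding-window comparison. This is a simplified illustration and not the recombination analysis performed in this study; the function names, window size, step, and the toy sequences labeled "parent_1" and "parent_2" are all hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length segments."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def closest_parent_by_window(query, parents, window=5, step=5):
    """For each window of the aligned query, name the most similar parent.

    `parents` maps a label to a sequence aligned to `query`. Ties go to
    the parent listed first in the dictionary.
    """
    lengths = {len(query)} | {len(seq) for seq in parents.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")

    calls = []
    for start in range(0, len(query) - window + 1, step):
        segment = query[start:start + window]
        best = max(
            parents,
            key=lambda name: identity(segment, parents[name][start:start + window]),
        )
        calls.append((start, best))
    return calls


# Hypothetical toy alignment: the query matches parent_1 in its first
# half and parent_2 in its second half.
parents = {"parent_1": "A" * 20, "parent_2": "C" * 20}
query = "A" * 10 + "C" * 10
calls = closest_parent_by_window(query, parents, window=5, step=5)
```

A change in the best-matching parent between adjacent windows marks a candidate breakpoint region, and the regions on either side of it indicate which parental subtype shares sequence with the recombinant there.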
In conclusion, the neurotoxins produced by Clostridium tetani, C. butyricum, and C. baratii are as similar, or more similar, to C. botulinum neurotoxins as the various serotypes of BoNT are to each other (36). Historically, the expression of these neurotoxins has been used to taxonomically identify these clostridia as C. botulinum or C. tetani. The presence of these toxins in different genetic backgrounds suggests their movement both within the species and among other species. Most of these bacteria are distributed throughout the world, yet there is no known geographical relationship to the genetic diversity. Environmental niches, geographic distribution, and gene transfer mechanisms among these spore-forming clostridia must all interact to produce the sequence diversity observed in one of the most lethal neurotoxins known. The BoNTs produced by these clostridial species show sequence differences both within and between serotypes. Identifying the extent of these differences is the crucial first step in the development of improved diagnostics and therapeutics for the treatment of botulism.
We thank the DOE Joint Genome Institute (JGI) at Los Alamos National Laboratory for their support by providing technical assistance and facilities for DNA sequencing. We thank Stephen Arnon for his thorough review of the manuscript.
Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the U.S. Army.
Published ahead of print on 17 November 2006.
Copyright © 2009 by the American Society for Microbiology.