Journal of Bacteriology, July 2000, p. 3989-3997, Vol. 182, No. 14
0021-9193/00/$04.00+0
Copyright © 2000, American Society for Microbiology. All rights reserved.
vrrB, a Hypervariable Open Reading Frame in Bacillus anthracis
James M. Schupp, Alexandra M. Klevytska, Guenevier Zinser, Lance B. Price, and Paul Keim*
Department of Biological Sciences, Northern
Arizona University, Flagstaff, Arizona 86011-5640
Received 25 February 2000/Accepted 18 April 2000
ABSTRACT
Bacillus anthracis appears to be the most molecularly
homogeneous bacterial species known. Extensive surveys of worldwide isolates have revealed vanishingly small amounts of genomic variation. The biological importance of the resting-stage spore may lead to very
low evolutionary rates and, perhaps, to the lack of potentially adaptive genetic variation. In contrast to the overall homogeneity, some gene coding regions contain hypervariability that is translated into protein variation. During marker analysis of diverse strains, we
have discovered a novel ca. 750-nucleotide open reading frame (ORF)
that contains in-frame, variable-number tandem-repeat sequences. Four
distinct variable regions exist within vrrB, giving rise to
11 distinct alleles in eight different length categories among B. anthracis strains. This ORF putatively codes for a 241- to 265-amino-acid protein, rich in glutamine (13.2%), glycine
(23.4%), and histidine (23.0%). The variable-region amino acids of
the vrrB ORF are strongly hydrophilic. Coupled with
putative transmembrane domains flanking the variable regions, this
suggests a membrane-anchored cytosolic or extracellular location for
the putative protein. Sequence analysis of the complete ORFs
from three Bacillus cereus strains shows maintenance of the
ORF across species boundaries, including strong conservation of the
amino acid sequence and the capacity to vary among strains. The
presence of 11 different alleles of the vrrB locus is in
stark contrast to the near homogeneity of B. anthracis. Evolution of hypervariable genes can negate the lack
of genetic variability in species such as B. anthracis
and provide select rapid evolution in other more variable species.
INTRODUCTION
Bacillus anthracis is
found throughout the world in a wide range of environments and in a
wide variety of large mammalian hosts. The pathogen is thought to have
evolved very recently, in the last 10,000 to 20,000 years, from a
Bacillus cereus or Bacillus thuringiensis strain
that fortuitously acquired the two anthrax virulence plasmids. There is
limited evidence suggesting that B. anthracis can replicate
as a free-living soil bacterium under optimal conditions of high soil
moisture, alkaline pH, and sufficient nutrient availability, although
this has yet to be demonstrated conclusively (25). In
most environments, B. anthracis more likely lies
dormant in the soil as a spore between deadly infections. However, upon
infection, rapid vegetative clonal expansion occurs within the host
until the host dies or overcomes the microbe. Once dead, the host's
body fluids, containing high numbers of the offending bacilli, are
leaked into the surrounding soil, setting up the next infection cycle,
whether it be within hours, days, or years.
The evolutionary change of organisms usually entails three distinct
processes: mutation, recombination, and selection. Generation time is
an important parameter for mutation and selection. Mutations, which
provide raw genetic variation, generally occur during DNA replication
and are frequently measured on a per-generation basis. Bacterial
genetic recombination occurs via the exchange of genetic material
through phages and other mobile genetic elements, creating novel gene
combinations. Selection acts on mutational and recombinational changes
to influence differential propagation of genetic types and is greatly
influenced by the number of generations involved. In general,
evolutionary change will increase with an increasing number of generations.
In B. anthracis, the period of vegetative expansion within
the host represents the most likely time for evolutionary change as
mutation rates, recombination, and selective pressure will be greatly
reduced during the resting spore stage. Because no propagation
occurs in the spore stage, selection there acts mainly through
differential loss or mortality of spores. Environmental selection is certainly
present at all B. anthracis growth stages, but it must have
genetic variation upon which to act. Genetic recombination among
B. anthracis strains, or even with other species, must be
relatively rare given the explosive and short nature of the B. anthracis vegetative growth stage. Phylogenetic analysis of
plasmid and chromosomal sequences found no evidence of horizontal transfer (an obvious form of recombination) of a virulence plasmid (pX01) among diverse strains (20). Genetic variation
generated by mutation would appear to be a limiting factor for B. anthracis evolution and adaptation.
In many bacteria, it has been shown that variable-number tandem repeats
(VNTRs) contained within genes and nongenic regions are extremely
diverse (26). VNTRs have been found to affect regulation and
product function in genes associated with pathogenesis in a variety of
bacterial pathogens. Intragenic VNTRs have been found to affect
lipopolysaccharide (LPS) phase variation in Haemophilus influenzae (28). LPS phase variation has been shown to
function in immune evasion and translocation in Neisseria
gonorrhoeae (27). A pentameric VNTR causing independent
translational frameshifts within the members of a family of outer
membrane protein genes associated with epithelial invasion has been
discovered and characterized in N. gonorrhoeae. This VNTR
provides a mechanism for antigenic variation or shifting (15, 19,
23). N. gonorrhoeae also exhibits LPS variation, and
while a mechanism of variation has yet to be discovered, a VNTR may be
involved. The M protein genes in group A streptococci have been shown
to contain variable repetitive DNA elements, resulting in differential
protection against phagocytosis (2). Variable repetitive DNA
elements in the alpha C protein genes of group B streptococci have been
shown to affect differential protection from antibody-mediated killing
(14).
In B. anthracis, such VNTR gene variation has been
documented previously at the vrrA locus (1, 6).
Insertion and deletion events accounted for half of the 30 marker
differences in an extensive survey of strains by using amplified
fragment length polymorphism (AFLP) markers (10). In this
report, we demonstrate that at least some of this rare AFLP variation
is due to VNTRs, as the vrrB locus was first detected as a
four-allele AFLP marker (10). Upon sequence
characterization, a complex repetitive region was found within a large
open reading frame (ORF). A total of 11 alleles were found within eight
different size classes, resulting from combinations of
9-nucleotide insertion-deletion polymorphisms that maintain the
translational reading frame. This VNTR variation is of great use in
B. anthracis typing and may also provide a source of genetic
differences for evolutionary change in this highly homogeneous species.
MATERIALS AND METHODS
B. anthracis isolates and DNA extraction and
purification.
Isolates were obtained from different sources (Table
1). The isolates were cultured, and DNA
was extracted and purified as previously described (7).
AFLP fragment extraction and sequencing.
All reagents were
obtained from Life Technologies, Inc., Gaithersburg, Md., unless
otherwise noted. AFLP analysis was performed as previously described
(10). Polymorphic AFLP EcoRI-MseI C/T +1/+1 fragments (10) corresponding to four different
alleles, ca. 600 bp in size, were extracted from a dried 6%
polyacrylamide gel, amplified, and sequenced as previously described
(21).
Isolation of entire vrrB ORF and flanking
regions.
Ligation-mediated suppression PCR was used to obtain the
entire vrrB ORF and surrounding regions as previously
described (21). The upstream-oriented locus-specific primer
was CT600u1 (5'-CCCATTGATGTAGGCATTCCTG-3'), and the
downstream-oriented primer was CT600d1
(5'-ATCAACAACAATCTTCACCTTGGG-3').
PCR amplification.
The hypervariable regions from 24 isolates were amplified as follows. Five nanograms of genomic DNA, 40 pmol of primer vrrBHR1f (5'-ATAGGTGGTTTTCCGCAAGTTATTC-3'),
40 pmol of primer vrrBHR2r (5'-CCCAAGGTGAAGATTGTTGTTGA-3'),
100 µM (each) deoxynucleoside triphosphate (dNTP), 2 mM
MgCl2, 10 µl of 10× PCR buffer, 5 U of Taq
DNA polymerase, and double-distilled H2O
(ddH2O) were added to a final volume of 100 µl. The
reaction mixtures were incubated at 94°C for 3 min and then cycled at
94°C for 30 s, 65°C for 20 s, and 72°C for 20 s
for 35 cycles, with a final 72°C incubation for 2 min. The
amplification products were purified using a Qiaquick PCR purification
kit (Qiagen Inc., Valencia, Calif.) and sequenced on an ABI 377 fluorescent sequencer using the PCR amplification primers.
The entire ORF from each of 10 diverse isolates (Table 1) was amplified by PCR as follows. Twenty picomoles of primer VRRBCODF1 (5'-ACTTCCGAAAGAATATGTAGAAGGTT-3'), 20 pmol of primer VRRBCODR1 (5'-GAGTTTTATGCAAGAAGAGCTAGAAGA-3'), 5 ng of genomic DNA, 200 µM (each) dNTP, 2 mM MgCl2, 5 µl of 10× PCR buffer, 2 U of Taq DNA polymerase, and ddH2O were added to a final volume of 50 µl. The reaction mixtures were incubated at 94°C for 3 min and then cycled at 94°C for 30 s, 63°C for 30 s, and 72°C for 60 s for 45 cycles, with a final 72°C incubation for 5 min. The amplification products were purified and sequenced as described above with the use of an additional internal forward primer (VRRBF2; 5'-AGAAGCGGAATTCCAATACAGAC-3').
A large portion of each of the vrrB ORFs from three American Type Culture Collection (ATCC) strains of B. cereus (B. cereus 11778, B. cereus 31293, and B. cereus 43881) was amplified as follows. Twenty picomoles of primer vrbF3 (5'-GGATGGACAATAATGCACCAC-3'), 20 pmol of primer vrbR3-1 (5'-AACGTCAACGCCAAGAAGC-3'), 5 ng of genomic DNA, 200 µM (each) dNTP, 2 mM MgCl2, 5 µl of 10× PCR buffer, 2 U of Taq DNA polymerase, and ddH2O were added to a final volume of 50 µl. The reaction mixtures were incubated at 94°C for 3 min and then cycled at 94°C for 30 s, 55°C for 30 s, and 72°C for 60 s for 45 cycles, with a final 72°C incubation for 5 min. The amplification products were purified and sequenced as described above.
DNA sequence analysis.
Multiple sequence alignment analysis
(CLUSTAL) and dot plot similarity analysis were performed with the
MEGALIGN subroutine in the LASERGENE software package (DNASTAR Inc.,
Madison, Wis.). The multiple sequence alignment analysis was optimized
to conserve nucleic acid residues. Dot plot similarity analysis was
performed with a 9-nucleotide or 3-amino-acid window with an 80 or
100% similarity requirement, respectively.
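The windowed comparison behind a dot plot can be sketched in a few lines of Python. This is only an illustration of the general technique, not the MEGALIGN implementation; the function name `dot_plot` and the toy sequence are hypothetical, and only the window sizes and similarity thresholds are taken from the text above.

```python
def dot_plot(seq_a, seq_b, window=9, min_identity=0.8):
    """Return (i, j) start positions whose windows meet the identity threshold.

    A hit means that the window starting at seq_a[i] and the window
    starting at seq_b[j] are identical at >= min_identity of positions.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches / window >= min_identity:
                hits.append((i, j))
    return hits


# Toy tandem array (NOT vrrB sequence): three copies of one 9-nt unit.
toy = "CACCATGGT" * 3

# Nucleotide settings from the text: 9-nt window, 80% similarity.
nucleotide_hits = dot_plot(toy, toy, window=9, min_identity=0.8)

# Amino acid settings from the text: 3-residue window, 100% similarity.
protein_hits = dot_plot("HHGHHGHHG", "HHGHHGHHG", window=3, min_identity=1.0)
```

In a self-comparison of a tandem array, hits fall on the main diagonal and on parallel diagonals offset by multiples of the repeat length, which is how repeat structure shows up in such a plot.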
Predictive structure of the encoded protein.
The structure
of the putative vrrB protein encoded by the ORF containing
the VNTRs was predicted using the PROTEAN subroutine in the LASERGENE
software package. This subroutine uses the Chou-Fasman (3)
and the Garnier-Robson (5) algorithms for predicting alpha,
beta, and turn regions, the Garnier-Robson algorithm for predicting
coil regions, the Kyte-Doolittle (13) algorithm for predicting hydrophilicity, the Emini et al. (4) algorithm
for predicting surface probability, the Karplus-Schultz (9)
algorithm for predicting flexibility, and the Jameson-Wolf
(8) algorithm for predicting antigenicity. The TMPRED
algorithm (http://www.ch.embnet.org/software/TMPRED_form.html)
was used to predict putative transmembrane regions.
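As an illustration of the kind of calculation that underlies a Kyte-Doolittle profile, the following Python sketch averages hydropathy values over a sliding window. It is not the PROTEAN implementation; the function name and the window size are assumptions, and the test peptide is a made-up HHG array rather than the deduced vrrB sequence. On the Kyte-Doolittle scale, negative values indicate hydrophilic residues.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Mean Kyte-Doolittle hydropathy for every full window along protein.

    The window size is a free parameter; 7 is only an assumed default.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical HHG tandem array: every 3-residue window holds two H and one G.
hhg_profile = hydropathy_profile("HHG" * 3, window=3)
```

Every window of the toy HHG array averages to a clearly negative value, in line with the hydrophilic character reported for the histidine- and glycine-rich repeats.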
Nucleotide sequence accession numbers.
The GenBank accession
number for the complete vrrB ORF and surrounding sequence
from B. anthracis isolate 83 is AF238885. Accession numbers
for the vrrB hypervariable-region sequences from select
B. anthracis isolates shown in Table 1 are AF238889 (2PT),
AF238890 (MOZ-3), AF238891 (A46), AF238892 (BA0015), AF238893
(Vollum), AF238894 (A24), AF238895 (Zim69), AF238896 (BA1015),
AF238897 (J611), AF238898 (109), and AF238899 (BA1035). The accession
numbers for the partial vrrB ORF sequences from B. cereus strains ATCC 11778, ATCC 31293, and ATCC 43881 are
AF238886, AF238887, and AF238888, respectively.
RESULTS
Identification and characterization of vrrB.
A highly
polymorphic and apparently allelic set of AFLP DNA fragments (fragment
ECO-C/MSE-T) identified in a strain diversity study (10)
were characterized by DNA sequencing. Figure
1 shows a 795-nucleotide ORF
(vrrB) from isolate 83 (Table 1), which contains a
hypervariable region of tandemly repeated sequences (see below), as
well as the conserved surrounding sequences. Comparison of the complete
ORFs from 10 diverse isolates revealed three single-nucleotide polymorphisms (SNPs), two of which resulted in amino acid changes: alanine-75 to threonine-75 and histidine-123 to glutamine-123 (Fig. 1).
All three SNPs were found to be in linkage disequilibrium in the 10 strains examined and hence defined only two alleles. However, even
three SNPs represent highly significant variation in B. anthracis, as the 5' portion of vrrB and the 5'
intergenic regions did not vary among 25 diverse strains (L. Price,
unpublished data).

FIG. 1.
Nucleotide sequence of the B. anthracis vrrB
ORF. The entire nucleotide sequence for the largest vrrB
allele (K) is presented along with flanking regions containing the
leuS gene (5') and an unidentified ORF (3'). The deduced
amino acid sequence of the vrrB ORF is shown below the
nucleotide sequence. Known nucleotide differences among B. anthracis strains are shown above the nucleotide sequence.
Putative amino acid changes within the vrrB ORF are shown
below the amino acid sequence. The opposed arrows beneath the
nucleotide sequence indicate regions of possible stem-loop structures.
The arrows above the nucleotide sequence indicate ORFs. Putative RBS as
well as start and stop codons are indicated by boxes.
Putative ribosome binding sites (RBS) are found 8 and 9 bases upstream of the first and second methionine residues of the vrrB ORF, respectively. No other possible start codons were found to have putative RBS. None of a number of Bacillus subtilis promoter sequences associated with different sigma factors, including housekeeping genes and sporulation genes (16), were found upstream of the vrrB ORF. A strong stem-loop structure just downstream of the vrrB ORF stop codon may be a ρ-independent transcription termination signal. Another possible stem-loop structure 48 nucleotides upstream of the first possible vrrB start codon may act as an expression regulation signal. A BLAST search of GenBank with the deduced vrrB amino acid sequence found no similarity with any publicly available protein sequence, including the complete B. subtilis genome.
We discovered ORFs both upstream and downstream from the vrrB ORF (Fig. 1). The upstream flanking ORF is oriented in the same direction as the vrrB ORF. In this case, GenBank BLAST searches revealed very strong deduced amino acid sequence similarity with the 3' end of the B. subtilis leucyl-tRNA synthetase gene (leuS; GenBank accession no. M88581). A strong stem-loop structure just downstream of the B. anthracis leuS gene is presumably the ρ-independent transcription termination signal. The downstream ORF was found in the orientation opposite that of the vrrB ORF (Fig. 1). A BLAST search revealed no sequence similarity with any known protein or ORF. Transcription of this ORF may be terminated by the same putative vrrB ORF ρ-independent transcription termination signal, but from the opposite direction.
Comparison of hypervariable regions among B. anthracis
strains.
We have found that the hypervariable region is due to
VNTRs. This region was amplified from 24 diverse B. anthracis strains and then sequenced. A CLUSTAL multiple-sequence
alignment optimized for base conservation revealed eight different size
classes and multiple alleles within three of the size classes (Fig.
2). Altogether, we have identified the 11 alleles shown in Fig. 2. The vrrB ORF sizes in the different
alleles range from 723 to 795 nucleotides. All allele sizes are in
increments of 9 nucleotides except for alleles I and J, which are
separated in size by 18 nucleotides. The hypervariable region appears
to consist of a series of 9-nucleotide degenerate repeat units with a
consensus sequence of
CA(C8/T11)CA(A8/C7/T4)GG(A3/C3/G1/T12) or
CA(A3/C1/T5)CA(A1/C4/G2/T2)CA(A2/C6/T1),
with some minor variation, leading to primarily HHG or QQY amino acid
repeat units. Note that the variable positions within the consensus
9-nucleotide repeat represent the third codon position.
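The effect of third-position variation can be illustrated by enumerating the variants of the first consensus repeat, CA(C/T)CA(A/C/T)GG(A/C/G/T), and translating each with the standard genetic code. The Python sketch below is illustrative only; the helper names are hypothetical, and the enumeration ignores the observed base counts and the "minor variation" mentioned above. The second consensus could be examined in the same way.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def consensus_peptides(prefixes, third_bases):
    """Map every 9-nt variant of a degenerate repeat to its tripeptide.

    prefixes    -- the fixed first two bases of each codon, e.g. ["CA", "CA", "GG"]
    third_bases -- the bases allowed at each third position, e.g. ["CT", "ACT", "ACGT"]
    """
    peptides = {}
    for thirds in product(*third_bases):
        dna = "".join(p + t for p, t in zip(prefixes, thirds))
        peptides[dna] = translate(dna)
    return peptides


# First consensus from the text: CA(C/T) CA(A/C/T) GG(A/C/G/T).
first_consensus = consensus_peptides(["CA", "CA", "GG"], ["CT", "ACT", "ACGT"])
unique_peptides = sorted(set(first_consensus.values()))
```

Of the 24 nucleotide variants enumerated here, 16 encode HHG and the remaining 8 encode HQG, so most third-position changes in this repeat unit leave the encoded tripeptide unchanged.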

FIG. 2.
Multiple alignment of the hypervariable
regions of the vrrB ORFs from all observed B. anthracis allele types. Nucleic acid sequences from the
hypervariable regions of the vrrB ORFs from all 11 observed
allele types were aligned, optimizing for nucleotide conservation. The
shaded rectangles indicate every other degenerate 9-bp repeat. The
arrows indicate the positions of primers used to amplify the
vrrB1 and vrrB2 regions.
The asterisks indicate the positions of point mutations. The consensus
deduced amino acid sequence is shown below the nucleic acid alignment.
The table at the bottom provides the amplicon size and observed
frequency of each vrrB1 and
vrrB2 allele. The DIs for individual and
combined regions are shown at the bottom of the table. Note that the H
and I combined alleles are indistinguishable by PCR analysis and that
the frequency of 0.04 is for both alleles taken together.
Just upstream of the first insertion-deletion region in the hypervariable region are two of the three observed SNPs, an A-C and a C-T at positions 1017 and 1023, respectively (Fig. 1). Both of these mutations seem to be linked to particular insertion-deletion patterns. vrrB alleles A, D, E, G, H, and I have similar insertion-deletion patterns (Fig. 2) and share the A and C SNPs. vrrB alleles B, C, F, J, and K also have similar insertion-deletion patterns (Fig. 2) and share the C and T mutations. It would seem that the random generation of SNPs and insertions and deletions are correlated in the evolutionary history of these alleles.
Two related but distinct variable regions.
Repeated sequence
variation occurs in two or more semiautonomous regions in the
vrrB ORF. A short nonrepetitive sequence in the center of
the hypervariable region was used as a PCR priming site to
independently amplify each variable region (Fig. 2). PCR fragment size
analysis revealed five size classes for hypervariable region one
(vrrB1) and four size classes for hypervariable
region two (vrrB2). The
vrrB1 amplicon sizes were 183, 192, 219, 228, and 255 bp. The vrrB2 amplicon sizes were 142, 151, 160, and 169 bp. Fluorescently labeled PCR primers were used to
examine 425 B. anthracis isolates for vrrB
variation based upon amplicon size (12). Allele frequencies
based upon this large survey for each sublocus and the entire
vrrB locus are shown in Fig. 2. The 228-nucleotide vrrB1 allele and the 160-nucleotide
vrrB2 allele are very common in B. anthracis strains, making up ~83 and ~77% of all alleles observed, respectively.
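The amplicon sizes reported above can be checked for consistency with whole 9-nucleotide repeat units, which in turn preserves the reading frame. The following Python sketch uses the sizes given in the text; the function name is hypothetical.

```python
# Amplicon sizes (bp) as reported in the text.
VRRB1_SIZES = [183, 192, 219, 228, 255]
VRRB2_SIZES = [142, 151, 160, 169]


def repeat_unit_steps(sizes, unit=9):
    """Express each amplicon size as a number of repeat units above the smallest.

    Raises ValueError if any size difference is not a whole multiple of unit.
    """
    smallest = min(sizes)
    steps = []
    for size in sorted(sizes):
        difference = size - smallest
        if difference % unit != 0:
            raise ValueError(f"{size} bp is not a multiple of {unit} nt above {smallest} bp")
        steps.append(difference // unit)
    return steps


vrrb1_steps = repeat_unit_steps(VRRB1_SIZES)  # [0, 1, 4, 5, 8]
vrrb2_steps = repeat_unit_steps(VRRB2_SIZES)  # [0, 1, 2, 3]
```

All size differences are whole multiples of 9 nucleotides, and therefore of 3, so every size class keeps the ORF in frame.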
The sublocus vrrB1 contains two insertion-deletion regions, with the upstreammost 63 bp in size and the downstreammost 36 bp in size (Fig. 2). The primary variable amino acid repeat motif is HHG for both. The diversity index (DI), calculated by subtracting the sum of the squared allele frequencies from 1, is 0.30 for the vrrB1 sublocus.
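The DI calculation described above (1 minus the sum of the squared allele frequencies) can be written as a short Python function. The function name and the example frequencies are hypothetical; the observed vrrB allele frequencies are those given in Fig. 2 and are not reproduced here.

```python
def diversity_index(frequencies):
    """Return 1 - sum(p_i ** 2) for a list of allele frequencies summing to 1."""
    total = sum(frequencies)
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"allele frequencies sum to {total}, expected 1")
    return 1.0 - sum(p * p for p in frequencies)


# Made-up frequencies for a three-allele locus (NOT the vrrB data).
example_di = diversity_index([0.5, 0.3, 0.2])

# A locus with a single allele has no diversity.
monomorphic_di = diversity_index([1.0])
```

Because the sum of the squared frequencies is the chance that two randomly drawn strains carry the same allele, the DI can be read directly as the probability that two strains differ at the locus.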
The sublocus vrrB2 contains three insertion-deletion regions that are smaller than those found in vrrB1 and that range in size from 9 to 36 bp (Fig. 2). The first part of the vrrB2 region contains an HHG repeat, the second part contains a QQH, and the third part contains a QQY, indicating greater complexity than vrrB1. The vrrB2 sublocus diversity is 0.38, slightly greater than that observed for vrrB1. We have compared the repeated sequence array sizes of vrrB1 and vrrB2 and found no positive or negative correlation among insertion-deletion patterns. The diversity value for the combined regions is 0.51. Hence, the comparison of any two B. anthracis strains at the vrrB ORF has a 51% probability of revealing an allele difference.
The repetitive nature of the vrrB hypervariable region is modular and can be further broken down into four distinct regions. The subloci defined above (vrrB1 and vrrB2) are based upon our ability to design unique PCR primers, but structure within these subloci is also apparent (Fig. 2). The vrrB2 substructure is most obvious, with low similarity (5 of 9 nucleotides matching; 55%) between the regions. vrrB1, in contrast, has two regions with high similarity that are separated by a low-similarity region that is only obvious in the 100% homology dot plot analysis (Fig. 3).

FIG. 3.
Dot plot homology analysis of the entire vrrB
ORF from B. anthracis. Dot plot homology analysis was
performed on both the nucleotide sequence (top right; 9-bp window; 80%
similarity) and the deduced amino acid sequence (bottom left;
3-amino-acid window; 100% similarity). Sequence homology among repeat
regions is indicated by corresponding diagonal lines. The boxes
indicate regions where insertion-deletion events are found. The
hypervariable regions vrrB1 and
vrrB2 are indicated by brackets.
Predicted protein structure.
The dramatic size differences
among vrrB alleles are translated into different sized
proteins with potentially great differences in structure. The putative
vrrB protein size ranges from 241 to 265 amino acids, with
predicted molecular masses ranging from 24.9 to 27.8 kDa. The predicted
charge, at pH 7.0, ranges from 4.6 to 7.0, with the isoelectric point
varying from 7.4 to 7.5 for the different alleles. Common amino acids
in the predicted vrrB protein include ~23% glycine, 19.1 to 23% histidine, and 13% glutamine, all of which are found in
the variable-repeat region. Significant codon usage bias was seen with
six of seven of the most abundant amino acids. No codon bias was
observed with respect to histidine. The protein secondary-structure
predictions using PROTEAN (Fig. 4)
suggest that the two vrrB subloci code for highly hydrophilic regions, separated by a small hydrophobic region. The 3'
subregion within vrrB2 has a slightly
different amino acid composition that is evident in the hydrophilicity
plot and especially in the surface probability prediction, suggesting
it is the most likely portion of the protein to be exposed. The entire
hypervariable region is flanked by hydrophobic regions that show strong
similarity to known transmembrane regions of other proteins (by TMPRED
analysis).

FIG. 4.
Protein structure predictions for a putative
vrrB K allele protein. Structural characteristics for the
largest vrrB ORF putative protein (K) were predicted using
the PROTEAN subroutine of the Lasergene software package. Putative
transmembrane regions were predicted by TMPRED
(http://www.ch.embnet.org/software/TMPRED_form.html). The
hypervariable regions are indicated above the scale.
vrrB in B. cereus.
We have examined the
phylogenetic distribution of the vrrB gene and found it only
in bacteria closely related to B. anthracis. No signal was
observed from Southern hybridization of vrrB probes to
genomic DNAs of B. subtilis and Bacillus
megaterium (data not presented). Likewise, BLAST searches of the
complete B. subtilis genome found no significant similarity
to vrrB. vrrB probes did, however, hybridize to B. thuringiensis, Bacillus mycoides, and B. cereus DNAs during Southern analysis (data not presented). vrrB PCR challenges successfully amplified only three
B. cereus samples under our PCR conditions. We determined
these B. cereus vrrB ORF sequences, minus the 5'-most 45 to
60 bases, and found essentially the same ORF with no stop codons.
Strong deduced amino acid sequence conservation was also observed, with
most of the nucleotide differences occurring in the third codon
position (Table 2). The three B. cereus strains were nearly as different from each other (29.5%
difference) as they were from B. anthracis (30.4%). In all
comparisons, synonymous differences were equal to or predominant over
nonsynonymous ones (Table 2). In addition, amino acid differences were
generally within chemical residue categories (e.g., a neutral amino
acid for a neutral amino acid). The conservation of this ORF among
strains and across species boundaries argues strongly that it is a gene
with a functional protein product. In addition to the sequence
conservation, the hypervariable-repeat feature was also conserved
across species boundaries. Many in-frame insertion-deletion events were
found between B. anthracis and B. cereus, and
even among the B. cereus strains. All but two were in
association with the hypervariable regions observed within B. anthracis (Fig. 5). Also note that
highly conserved deduced amino acid sequences are found at the 5' end
of the putative protein, at the predicted transmembrane regions, and at
the short hydrophobic region separating the two hypervariable regions.
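The distinction between synonymous and nonsynonymous differences can be illustrated with a simple codon-by-codon comparison of two aligned coding sequences. The Python sketch below is a simplified illustration, not the method used to produce Table 2: it counts codons rather than individual substitutions, assumes gap-free in-frame sequences of equal length, and uses made-up toy sequences. All names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_differences(cds_a, cds_b):
    """Count differing codons as (synonymous, nonsynonymous).

    Both inputs must be in-frame, gap-free, and of equal length.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be equal-length multiples of 3")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a, codon_b = cds_a[i:i + 3], cds_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy sequences (NOT vrrB): H H G A versus H Q G T.
toy_counts = classify_codon_differences("CACCATGGTGCA", "CATCAAGGAACA")
```

In the toy pair, the CAC/CAT and GGT/GGA differences leave the amino acid unchanged, whereas CAT/CAA (His to Gln) and GCA/ACA (Ala to Thr) change it.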

FIG. 5.
Comparison of vrrB amino acid sequences in
B. anthracis and B. cereus. The nucleic acid
sequence from the largest B. anthracis vrrB allele and from
all but the 5'-most ends of the vrrB sequences from three
B. cereus ATCC strains were translated into the predicted
amino acid sequence and aligned using the CLUSTAL method. The shaded
areas indicate similarity among the sequences. The number of asterisks
above the B. anthracis sequence positions indicates the
number of synonymous nucleic acid mutations found at that position. The
horizontal arrows above the B. anthracis sequence indicate
the hypervariable-repeat regions. The vertical arrows indicate the
positions of dissimilar amino acid substitutions based upon
physicochemical properties (24). The numbers at the 3' ends
of the sequences indicate the number of deduced amino acids in each
sequence shown.
DISCUSSION
vrrB, the gene.
The evolutionary pattern of
nucleotide differences among strains and across species is consistent
with the vrrB ORF being an expressed gene. The ORF remains
intact across 11 alleles within B. anthracis and three
strains of B. cereus, which suggests that this gene encodes
a beneficial, if not an essential, gene product. Conserved amino acid
sequences, particularly at the predicted transmembrane regions, suggest
important roles for these regions of the putative protein.
Interestingly, we have evidence for this gene only in the type I
bacilli and not in other related bacteria, such as B. subtilis. It would seem that this gene has a unique role in this
group of bacteria. Likewise, it is not restricted to B. anthracis and is not likely to have a function unique to pathogens.
Assuming the vrrB ORF is an expressed gene, a number of features further suggest a contingency role as opposed to a housekeeping role for the vrrB ORF. None of a number of B. subtilis promoter sequences associated with different sigma factors, including housekeeping genes and sporulation genes (16), were found upstream of the vrrB ORF, suggesting a nonconstitutive and nonsporulation role for this gene. A possible stem-loop structure upstream of a weak RBS could be a trans-acting regulatory target for altering the expression level of the vrrB ORF. The highly biased codon usage of the putative vrrB gene product is consistent with a highly expressed gene (22). While the previous statements are certainly very speculative, the discussed characteristics of the vrrB gene and surrounding sequence, taken together, suggest a gene that is turned on and off in response, or contingent, to an environmental cue that requires a rapid response. In addition, the presence of the vrrB gene in some Bacillus strains but not others argues against an essential or housekeeping role for vrrB. Rather, it seems possible that vrrB has a unique and adaptive role.
vrrB evolution.
The presence of
insertion-deletion polymorphisms among B. anthracis isolates
is indicative of the homogenization mechanisms that create and
maintain directly repeated sequences. It has been suggested that the
homogenization mechanism is slip strand repair (26),
although unequal recombination could also be acting. All of the
vrrB repeats are related to some extent, though adjacent repeats tend to have greater similarity. This is consistent with a
cis-dependent homogenization mechanism, which could be
either slip strand repair or recombination. The triplet sequence CAX is
the dominant trinucleotide pattern, which suggests that the present complex repeat structure evolved from a relatively simple (CAX)n trinucleotide array. We believe that
point mutations in the third codon positions may have occurred, because
of their minor effect on protein structure, and that homogenization
then expanded these to adjacent repeats. Eventually, distinct
9-nucleotide repeats were created by successive rounds of mutation and
homogenization. The current strong 9-nucleotide repeat structure and
9-nucleotide insertion-deletion differences among strains indicate that
homogenization is currently acting on this repeat.
The distinctive repeat sequences found between and within vrrB1 and vrrB2 illustrate the highly cis-dependent nature of the homogenization process. As the repeated region grows, different parts appear to become autonomous and diverge. The vrrB1 and vrrB2 subloci are the extreme examples where a nonrepeated sequence now separates these repeat regions. A number of observations suggest that the vrrB1 and vrrB2 subloci are now independently evolving. As stated previously, the sizes of vrrB1 alleles do not covary with vrrB2 allele sizes. The DI of vrrB2 is slightly greater than that of vrrB1. A recent phylogenetic analysis of over 400 B. anthracis isolates using a battery of VNTR markers shows the vrrB1 alleles clustering in two exclusive genetic groupings, whereas at least two vrrB2 alleles are found distributed across groups (12). The latter observation is suggestive of convergent evolution within the vrrB2 sublocus.
Global geographical dispersal of identical vrrB alleles within a phylogenetic group could be due to human transport in the form of contaminated bone meal or animal hides (Table 1).
Even within both subloci, two clusters of more similar repeats are observed (Fig. 3). It appears that there is an optimal size for homogenization and that array expansion beyond this limit leads to separate dynamic repeats. The relationship between rates of homogenization and mutation is crucial to the resulting structure but may be difficult to determine as little is currently known about the rates of the homogenization mechanisms.
Adaptive variation?
VNTRs have been shown to have a variety of
functions in bacterial genomes, from gene expression regulation to
antigenic shifting (for a review, see reference 26),
and to result in variation among a number of virulence-associated genes
(2, 14, 15, 19, 23, 27, 28). It has been proposed that
variation within contingency genes enables genetic flexibility while
maintaining overall genome integrity (17). Many bacterial
contingency genes containing VNTRs have high mutation rates, which may
allow for rapid adaptation to changing environmental conditions
(18). Indeed, most of the variation seen to date in B. anthracis is due to VNTRs in ORFs (11, 12). The
presence of VNTRs in an otherwise highly genetically monomorphic
pathogen such as B. anthracis may be important for
generating variation essential for adapting to various hosts and
environments. Frequently, VNTRs have been discovered from the analysis
of genes associated with particular phenotypic changes (e.g., antigenic
shifts), but in this age of high-volume genomics, many more VNTRs are
likely to be identified from genomic analysis without accompanying
phenotypic phenomena. We are conducting an extensive survey of the
newly available B. anthracis genome sequence for additional VNTRs.
While it is reasonable to assume that in some cases this variation will
be effectively neutral, there will be many examples of biologically
important changes associated with VNTRs. The future challenge will be to
discover what effect these hypervariable regions have upon the biology of
the organism.
 |
ACKNOWLEDGMENTS |
This work was supported by funds from NIH (RO1-GM60795), DOE
(FG03-00NN20102), and The E. Raymond and Ruth Reed Cowden Endowment for Microbiology.
We thank M. E. Hugh-Jones (Department of Epidemiology and
Community Health, College of Veterinary Medicine, Louisiana State University, Baton Rouge) for providing all the B. anthracis
isolates used in this study.
 |
FOOTNOTES |
*
Corresponding author. Mailing address: Department of
Biological Sciences, Northern Arizona University, Flagstaff, AZ
86011-5640. Phone and fax: (520) 523-1078. E-mail:
Paul.Keim{at}nau.edu.
 |
REFERENCES |
1. Andersen, G. L., J. M. Simchock, and K. H. Wilson. 1996. Identification of a region of genetic variability among Bacillus anthracis strains and related species. J. Bacteriol. 178:377-384.
2. Bessen, D., K. F. Jones, and V. A. Fischetti. 1989. Evidence for two distinct classes of streptococcal M proteins and their relationship to rheumatic fever. J. Exp. Med. 169:269-283.
3. Chou, P. Y., and G. D. Fasman. 1978. Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. Relat. Areas Mol. Biol. 47:45-148.
4. Emini, E. A., J. V. Hughes, D. S. Perlow, and J. Boger. 1985. Induction of hepatitis A virus-neutralizing antibody by a virus-specific synthetic peptide. J. Virol. 55:836-839.
5. Garnier, J., D. J. Osguthorpe, and B. Robson. 1978. Analysis of the accuracy and implications of a simple method for predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97-120.
6. Harrell, L. J., G. L. Andersen, and K. H. Wilson. 1995. Genetic variability of Bacillus anthracis and related species. J. Clin. Microbiol. 33:1847-1850.
7. Jackson, P. J., E. A. Walthers, A. S. Kalif, K. L. Richmond, D. M. Adair, K. K. Hill, C. R. Kuske, G. L. Andersen, K. H. Wilson, M. E. Hugh-Jones, and P. Keim. 1997. Characterization of the variable-number tandem repeats in vrrA from different Bacillus anthracis isolates. Appl. Environ. Microbiol. 63:1400-1405.
8. Jameson, B. A., and H. Wolf. 1988. The antigenic index: a novel algorithm for predicting antigenic determinants. Comput. Appl. Biosci. 4:181-186.
9. Karplus, P. A., and G. E. Schultz. 1985. Prediction of chain flexibility in proteins. Naturwissenschaften 72:212-213.
10. Keim, P., A. Kalif, J. Schupp, K. Hill, S. E. Travis, K. Richmond, D. M. Adair, M. E. Hugh-Jones, C. R. Kuske, and P. Jackson. 1997. Molecular evolution and diversity in Bacillus anthracis as detected by amplified fragment length polymorphism markers. J. Bacteriol. 179:818-824.
11. Keim, P., A. M. Klevytska, L. B. Price, J. M. Schupp, G. Zinser, K. L. Smith, M. E. Hugh-Jones, R. Okinaka, K. K. Hill, and P. J. Jackson. 1999. Molecular diversity in Bacillus anthracis. J. Appl. Microbiol. 87:215-217.
12. Keim, P., L. B. Price, A. M. Klevytska, K. L. Smith, J. M. Schupp, R. Okinaka, P. Jackson, and M. E. Hugh-Jones. 2000. Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J. Bacteriol. 182:2928-2936.
13. Kyte, J., and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.
14. Madoff, L. C., J. L. Michel, E. W. Gong, D. E. Kling, and D. L. Kasper. 1996. Group B streptococci escape host immunity by deletion of tandem repeat elements of the alpha C protein. Proc. Natl. Acad. Sci. USA 93:4131-4136.
15. Makino, S., J. P. M. Van Putten, and T. F. Meyer. 1991. Phase variation of the opacity outer membrane protein controls invasion by N. gonorrhoeae into human epithelial cells. EMBO J. 10:1307-1315.
16. Moran, C. P. 1993. RNA polymerase and transcription factors, p. 653-667. In A. L. Sonenshein, J. A. Hoch, and R. Losick (ed.), Bacillus subtilis and other gram-positive bacteria: biochemistry, physiology, and molecular genetics 1993. American Society for Microbiology, Washington, D.C.
17. Moxon, E. R., P. B. Rainey, M. A. Nowak, and R. E. Lenski. 1994. Adaptive evolution of highly mutable loci in pathogenic bacteria. Curr. Biol. 4:24-33.
18. Moxon, E. R., and P. B. Rainey. 1995. Pathogenic bacteria: the wisdom of their genes, p. 255-268. In B. A. M. Van der Zeijst, L. Van Alphen, W. P. M. Hoekstra, and J. D. A. van Embden (ed.), Ecology of pathogenic bacteria. Royal Dutch Academy of Sciences, second series, no. 96. Royal Dutch Academy of Sciences, Amsterdam, The Netherlands.
19. Murphy, G. L., T. D. Connell, D. S. Barritt, M. Koomey, and J. G. Cannon. 1989. Phase variation of gonococcal protein. II. Regulation of gene expression by slipped strand mispairing of a repetitive DNA sequence. Cell 56:539-547.
20. Price, L. B., M. Hugh-Jones, P. J. Jackson, and P. Keim. 1999. Genetic diversity in the protective antigen gene of Bacillus anthracis. J. Bacteriol. 181:2358-2362.
21. Schupp, J. M., L. B. Price, A. Klevytska, and P. Keim. 1999. Internal and flanking sequence from AFLP fragments using ligation-mediated suppression PCR. BioTechniques 26:905-912.
22. Shields, D. C., and P. M. Sharp. 1987. Synonymous codon usage in Bacillus subtilis reflects both translational selection and mutational biases. Nucleic Acids Res. 15:8023-8040.
23. Stern, A., and T. F. Meyer. 1987. Common mechanism controlling phase and antigenic variation in pathogenic neisseriae. Mol. Microbiol. 1:5-12.
24. Taylor, W. R. 1986. The classification of amino acid conservation. J. Theor. Biol. 119:205-218.
25. Titball, R. W., P. C. B. Turnbull, and R. A. Hutson. 1991. The monitoring and detection of Bacillus anthracis in the environment. J. Appl. Bacteriol. Symp. Suppl. 70:9S-18S.
26. Van Belkum, A., S. Scherer, L. Van Alphen, and H. Verbrugh. 1998. Short-sequence DNA repeats in prokaryotic genomes. Microbiol. Mol. Biol. Rev. 62:275-293.
27. Van Putten, J. P. M. 1993. Phase variation of lipopolysaccharide directs interconversion of invasive and immuno-resistant phenotypes of N. gonorrhoeae. EMBO J. 12:4043-4051.
28. Weiser, J. N., D. J. Maskell, P. D. Butler, A. A. Lindberg, and E. R. Moxon. 1990. Characterization of repetitive sequences controlling phase variation of Haemophilus influenzae lipopolysaccharide. J. Bacteriol. 172:3304-3309.