Journal of Bacteriology, August 2008, p. 5681-5689, Vol. 190, No. 16
0021-9193/08/$08.00+0 doi:10.1128/JB.00254-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
U.S. Department of Agriculture, Agricultural Research Service, Produce Safety and Microbiology Research Unit, Albany, California,1 Institute for Biological Sciences, National Research Council Canada, 100 Sussex Dr., Ottawa, Ontario K1A 0R6, Canada,2 GBS Laboratory, Tokyo 145-0064, Japan,3 Department of Medical Microbiology and Infectious Diseases, Erasmus MC, University Medical Center Rotterdam, 3015 GD Rotterdam, The Netherlands,4 International Centre for Diarrhoeal Disease Research, Bangladesh, GPO Box 128, Dhaka 1000, Bangladesh5
Received 27 March 2008/ Accepted 6 June 2008
The LOS is a surface-exposed molecule, and variability of the C. jejuni LOS may have arisen as a result of selection for antigenic diversity to evade the immune system of its various hosts. Variation in LOS structures is due to the diversity of monosaccharide components and the linkages between them and the derivatization of the monosaccharides with noncarbohydrate moieties. The formation of these linkages is determined by genes encoding glycosyltransferases and other transferases for the addition of various moieties located in the LOS biosynthesis locus. Also, the monosaccharides available are often determined by genes in the LOS biosynthesis locus that are involved in the synthesis of sugar intermediates such as CMP-N-acetylneuraminic acid (7, 25).
The identification of the LOS biosynthesis locus in NCTC 11168 facilitated the cloning and characterization of LOS biosynthesis genes from other C. jejuni strains involved in the transfer of galactose, N-acetylgalactosamine, and sialic acid to the LOS outer core (7, 12, 13). The LOS biosynthesis region was identified as a hypervariable region within C. jejuni strains by whole-genome microarray analyses (5, 23, 24, 29, 30, 33), and sequencing of the LOS biosynthesis loci from several C. jejuni strains revealed differences in gene content and organization (7, 10, 12, 13, 28). In addition, comparisons of LOS structures and the corresponding DNA sequences of the LOS biosynthesis loci demonstrated that the structural diversity can be the consequence of either major or minor genetic differences at the LOS biosynthesis loci of the strains (9). Eight LOS biosynthesis locus classes were defined previously based on major genetic differences, gene content, and organization. Three of these LOS locus classes—A, B, and C—encode genes responsible for the production of sialylated LOS that are ganglioside mimics (10), while five other loci (D to H) lack a cst gene that encodes a sialic acid transferase (6, 9, 28). Sequence analysis also revealed minor genetic variation between C. jejuni strains that resulted in major LOS structural differences between strains that possessed the same LOS locus classes (8, 10, 11, 13). These minor genetic variations included (i) phase-variable homopolymeric tracts, (ii) gene inactivation by the deletion or insertion of a single base (without phase variation), (iii) missense or nonsense mutations leading to the inactivation of a glycosyltransferase, and (iv) single or multiple missense mutations leading to "allelic" glycosyltransferases (9-11, 28).
Moreover, by using PCR-based screening of LOS gene content, ca. 80% of the LOS loci from more than 100 C. jejuni strains were assigned to these eight distinct locus classes (A through H) (28). In the present study, we report the sequencing and characterization of 20 additional C. jejuni LOS outer core biosynthesis loci between waaM (Cj1134 in NCTC 11168) and waaV (Cj1146c in NCTC 11168), including 15 with an unknown LOS locus class. Based on gene content and organization, we identified 11 novel LOS locus classes, and each of these newly characterized LOS locus classes possessed regions with some similarity in gene content and organization, as observed in previously described classes (A through H), suggesting high levels of recombination within the locus (28). In addition, we observed in these new classes many of the genetic mechanisms of variation previously observed with other C. jejuni LOS classes, including frameshift mutations and missense mutations. The importance of these findings with regard to the evolution of both recognition specificity and different LOS structures is also discussed.
TABLE 1. C. jejuni strains used in this study
DNA sequencing, assembly, and analysis. The sequencing reactions were performed on a Tetrad Thermocycler using the ABI Prism BigDye terminator cycle sequencing kit (version 3.0; Applied Biosystems, Foster City, CA) and standard protocols as recommended by the manufacturer and then analyzed. All labeled products were purified on DyeEx spin columns (Qiagen, Valencia, CA). DNA sequencing was performed on an ABI Prism 3100 genetic analyzer (Applied Biosystems) using the POP-6 polymer and ABI Prism genetic analyzer data collection and sequencing analysis software. The DNA primers used for PCR or sequencing were designed by using Primer Premier 5.0 (Premier Biosoft International). PCR sequencing primers were purchased from either Qiagen or MWG-Biotech, Inc. (High Point, NC). Sequencing reads were trimmed and assembled by using Lasergene Seqman II (version 6.0; DNAstar, Madison, WI).
Nucleotide sequences were compared against the sequences of bacterial origin of the nonredundant DNA sequence NCBI database using the Basic Local Alignment Search Tool (BLAST) programs BLASTN and BLASTP analysis (1, 32) through the NCBI website (http://www.ncbi.nlm.nih.gov/BLAST/). Conserved domain searches were also performed during BLASTP analysis (26). For comparison of the different C. jejuni LOS loci, BLAST 2 (35) was used at the NCBI website (http://www.ncbi.nlm.nih.gov/BLAST/).
Examination of LOS. LOS samples were prepared from bacteria that were subjected to complete digestion with proteinase K as described by Hitchcock and Brown (14). LOS samples were separated on 16% Tricine gels (Invitrogen, Carlsbad, CA) and then silver stained (Bio-Rad, Hercules, CA). An O-deacylated LOS sample of C. jejuni GC175 was prepared as described previously (21) and analyzed by capillary electrophoresis-electrospray ionization-mass spectrometry.
Mosaic LOS classes related to LOS classes D and F. The majority of newly described LOS loci have gene content and organization similarities to classes D and F (Fig. 1). Indeed, classes D and F are quite similar to each other, possessing four glycosyltransferase genes in common (orf18, orf19, orf20, and orf16). The class D and F loci differ in that class D contains an orf3 encoding a one-domain glucosyltransferase, followed by a glycosyltransferase gene (orf17). In contrast, class F possesses a larger orf3 that encodes a two-domain glucosyltransferase similar to Cj1135 from strain NCTC 11168 based on BLAST analysis. Previous structural analysis identified the two-domain product of orf3 as a glucosyltransferase that substitutes a glucose on both the LOS core heptoses, while the presence of the one-domain glucosyltransferase in class D and related classes suggests that only HepI is substituted with glucose (9).
FIG. 1. LOS classes related to class D and F. Arrows represent ORFs. Genes colored white are common to all LOS classes. Genes colored blue are present in class D. The light blue-colored gene is present in class F. A "G" beneath a gene indicates the presence of an HGT. Gene size is not drawn to scale.
The LOS classes J (strains RM1163 and RM1508) and S (strain RM3419) appear to be related to class F due to the presence of the first four ORFs (orf3, orf18, orf19, and orf20) adjacent to waaM of class F (Fig. 1) (strain RM1170, GenBank accession no. AY434498). Indeed, the 4,676 nt containing these four ORFs in class F (strain RM1170, nt 1222 to 5898) show 99% identity to class J and 96% identity to class S by pairwise alignment.
In the regions that differ from classes D and F, classes I, J, and S are structurally similar to each other in terms of gene order and gene content and provide evidence for a set of common insertion or deletion events. These classes diverge from classes D and F in that orf16 is deleted and replaced by five (class J) or six genes (classes I and S). Classes I and S contain six additional ORFs, and five of these (orf40, orf42, orf43, orf44, and orf45) show similarity to capsular biosynthesis genes from the HS:41 C. jejuni strain 176.83 (GenBank accession no. BX545857). The class J LOS locus also contains these five capsular biosynthesis ORFs. The orf40 encoding a glycosyltransferase has a putative length of 1,053 nt and shows 92% identity over a 643-nt span to HS41.29 from the HS:41 C. jejuni strain 176.83. The four additional HS:41-like ORFs (orf42, orf43, orf44, and orf45) encode a nucleotidyl-sugar pyranose mutase, a sugar epimerase, putative UDP-glucose 6-dehydrogenase, and nucleotidyl-sugar pyranose mutase, respectively. These genes are in the same order as the ORFs 41.24, 41.25, 41.26, and 41.27 from the Penner HS:41 capsular biosynthesis region of C. jejuni strain 176.83 (GenBank accession no. BX545857) (18). Indeed, the 4,094-nt region spanning these four ORFs in classes I, J, and S shows 95% identity to the capsular region from the HS:41 strain 176.83, suggesting that the whole gene cassette (orf42, orf43, orf44, and orf45) was transferred together. Other than the HS:41 genes, the class I and S LOS loci possess a sixth ORF, orf41, that shows similarity at the amino acid level to a number of putative group 1 glycosyltransferases from other bacteria (pfam00535). Considering that the 5' 279 nt of orf40j and orf41i(s) are 98% identical, it is likely that orf41 recombined into this region of orf40. Thus, it appears that an additional insertion event (orf41) gave rise to class S from class J. 
This six-gene cassette could then homologously recombine from class S to class D, giving rise to class I.
The remaining three LOS classes (classes K, N, and Q) that have similarity to the class D LOS locus exhibit distinct insertion and deletion events. The class K LOS locus contains two additional ORFs that replace the class D orf20 and orf16. The first, orf49k, has 46% similarity to the capsular gene Cj1431c at the amino acid level (GenBank accession no. CAB73855), and orf50k has 73% similarity to orf30h from LOS class H at the amino acid level (GenBank accession no. AAW79071). However, this region shows little identity to any bacterial sequences by BLASTN analysis. It should be noted that the 3,473-nt region (nt 4294 to 7766 of EF143353, strain RM2227) containing these two genes and 100 nt on each side is more than 82% A+T-rich (data not shown), and this nucleotide content may be a factor in the recombination event. The class N LOS locus contains an ORF (orf38n) that replaces orf19, orf20, and orf16. This ORF shows similarity to the RfaJ family of glycosyltransferases (COG1442) and has 86% similarity to orf38g from LOS class G at the amino acid level (GenBank accession no. AAR98510). The deletion of orf19 and orf20 can be attributed to a recombination event between orf18 and orf20, which have 93% nt identity over their first 300 nt. Indeed, evidence of such recombination is the presence of a homopolymeric G-tract (HGT) within this 300-nt region of orf18n that is generally found in orf20 of other LOS loci. Finally, the class Q LOS locus (strain RM3437) possesses all ORFs present in class D, with the addition of orf46q inserted between orf20q and orf16q. This ORF shows the conserved domain of group 1 glycosyltransferases (pfam00534) and the RfaG family (COG0438). The insertion altered the 3'-terminal end of orf16q and the 5'-terminal end of orf20q compared to orf16 and orf20 from classes D and F. The HGTs that are generally present in the orf20 and orf16 genes are absent from these genes in the class Q LOS locus; instead, the orf46q gene possesses an HGT.
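As an illustration of how homopolymeric G-tracts such as those described above might be located in a nucleotide sequence, the following is a minimal sketch. The function name, the toy sequence, and the minimum tract length of 8 are assumptions made for this example; they are not taken from the methods of this study.

```python
import re

def find_g_tracts(seq, min_len=8):
    """Return (start, length) for every run of G that is at least min_len long.

    Positions are 0-based. The min_len default is an illustrative assumption,
    not a threshold used in the study.
    """
    pattern = re.compile(r"G{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq.upper())]

# Hypothetical toy sequence: a 9-G tract after the first two codons, plus a short GGG run.
toy = "ATGAAA" + "G" * 9 + "TTTACGGGATAA"
print(find_g_tracts(toy))  # the 3-G run is below the threshold and is not reported
```

Only the strand supplied is scanned; a tract encoded on the opposite strand would appear as a run of C and would need the reverse complement to be searched as well.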
Identification of two new LOS loci containing the cstII gene. The class A locus has the potential to synthesize sialylated LOS that are ganglioside mimics, and two LOS classes were identified that possessed portions of class A. The class M locus appears to be a mosaic of LOS classes A and D genes (Fig. 2). This locus resembles class A in the regions adjacent to waaV, containing a cassette of five genes. This cassette with genes encoding a sialic acid transferase (cstII; orf7) and enzymes involved in CMP-N-acetyl neuraminic acid biosynthesis (neuBCA: orf8, orf9, and orf10) (7, 25) could also allow the synthesis of a sialylated LOS. The organization of genes adjacent to waaM (orf2m) is different from all of the C. jejuni LOS loci previously examined. Inserted between waaM and the class D gene cassette of orf3 and orf17 is orf51, with 74% similarity to orf19 of class D and class F. This insertion appears to have caused a 9-nt truncation of the waaM gene, while the orf3 gene is not disrupted. It is not clear what recombination mechanism may be responsible for the insertion of orf51, but the regions where the insertion is likely to have occurred at the 3' end of waaM (nt 474 to 578) and 5' end before orf3m (nt 1752 to 1815) in RM1503 are ca. 90% A+T-rich (data not shown).
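The A+T-richness figures quoted for the putative insertion sites are simple base-composition fractions. A minimal sketch of that calculation is given below; the function names, window size, and toy sequences are hypothetical and are intended only to show the arithmetic, not to reproduce the analysis performed here.

```python
def at_fraction(seq):
    """Fraction of A and T bases in a nucleotide sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)

def windowed_at(seq, window, step):
    """A+T fraction for each full window, as (0-based start, fraction) pairs."""
    return [
        (start, at_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

# Hypothetical toy sequences, not data from any strain in this study.
print(at_fraction("AATTAATTGC"))
print(windowed_at("AAAAGGGG", window=4, step=4))
```

A sliding window of this kind is one way to look for locally A+T-rich stretches flanking an insertion; the appropriate window size would depend on the length of the region being examined.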
FIG. 2. LOS classes M and R contain cstII. Arrows represent ORFs. Genes colored white are common to all LOS classes. Genes colored red are genes present in classes M and R that are similar to class A. Genes colored blue are genes present in classes M and R that are similar to class D. A "G" beneath a gene indicates the presence of an HGT.
LOS class related to LOS class G. The class L LOS locus in strain RM3435 exhibits regions of similarity to several other LOS classes. Class L possesses three ORFs (orf35l, orf36l, and orf37l) in the middle of the locus with 93% identity across the 2,790 nt to class G (GenBank accession no. AAR98510) (Fig. 3). Also, like the class G locus, the class L LOS locus contains an orf3 that encodes a one-domain glucosyltransferase, but this is more similar to orf3 from classes E, H, and O loci with 97% nt identity. There are two additional ORFs: orf47l and orf48l. Orf47 shows the conserved domain of group 2 glycosyltransferases (pfam00535) and the WcaA family (COG0463) and has 81% identity to the orf19 glycosyltransferase gene from LOS class K. Orf48 also shows the conserved domain of group 2 glycosyltransferases (pfam00535) and the WcaA family (COG0463). This ORF is formed by a fusion between the 5' terminus of orf16 and a capsule-related gene. Indeed, the first 705 nt of orf48 are 90% identical to the orf16 from the class F locus, and another 504 nt are 96% identical to the capsular biosynthesis gene HS19.08 from the HS:19 C. jejuni strain CJ12517 (GenBank accession no. BX545860). Elsewhere in class L, the region adjacent to orf3 is approximately 1,000 nt and appears to contain fragments of several genes, including a 390-nt span that is 88% similar to orf5/10 (Cj1143 in strain NCTC 11168) from class C; however, the origin of the remainder of this region is unknown.
FIG. 3. Mosaic LOS class related to class G. Arrows represent ORFs. Genes colored white are common to all LOS classes. Genes colored pink are genes present in class L that are similar to class G. A "G" beneath a gene indicates the presence of an HGT.
FIG. 4. Intermediate LOS classes between class E and class H. Genes colored white are common to all LOS classes. The solid-green and spotted-green genes are merely used to distinguish the genes in class H that are more similar to class O or P. The orange colored gene is present in classes P and H. A "G" beneath a gene indicates the presence of an HGT.
TABLE 2. Variants of the glycosyltransferases related to class D and class Fa
FIG. 5. Effect of mutations on LOS structure. (A) Differences in the HGT in orf20 lead to in-frame (IF) (strain RM3415), out-of-frame (OOF) (variant of strain RM1221), and variable (IF/OOF) (strain RM1221) states in class F. (B) A 13-nt deletion disrupts orf18 in class K strain RM1861 compared to strain RM2227 with an intact orf18.
FIG. 6. LOS outer core structures of four class A C. jejuni (all Penner type HS:19) from Japanese patients. (A) The LOS outer core structures of OH4382, OH4384, and CF90-26 were reported previously (3, 22). The LOS outer structure of GC175 is proposed based on the mass spectrometry data presented in Table S1 in the supplemental material. (B) Comparison of the DNA sequences of the five glycosyltransferases involved in the biosynthesis of the LOS outer core. The identities are indicated between strains for each gene. The only difference between GC175 and CF90-26 is the deletion of an A in cgtA. This deletion of an A in cgtA is also the only difference between OH4382 and OH4384.
α-2,3- and α-2,8-sialyltransferase activities) and monofunctional alleles (α-2,3-sialyltransferase activity). The alignment of the protein sequences showed alleles with 92% identity, and site-directed mutagenesis showed that residue Asn51 was critical for the bifunctional activity (10). However, a CstII variant that possesses Asn51 but has only α-2,3-sialyltransferase activity has recently been characterized (11). This CstII variant has diverged significantly from the other CstII sequences, and it is possible that one (or several) amino acid substitution(s) have inactivated the α-2,8-sialyltransferase activity in that variant. We examined the diversity of the glycosyltransferases encoded by orf18 and orf19 that are present in multiple LOS classes. From the alignment of the Orf18 glycosyltransferase amino acid sequences among 16 C. jejuni strains that possess the full-length ORF, 12 protein variants ranging from 84 to 99% identity were observed (see Fig. S1 in the supplemental material). It should be noted that a particular LOS class did not necessarily possess a particular variant. For example, although Orf18 from both class D strains had identical amino acid sequences, the Orf18 sequences in the four class F strains were distinct. Moreover, the Orf18 from RM1221 (class F) was identical to the Orf18 from strains RM1163 and RM1508 (class J). There was similar diversity among the 10 variants of the 14 Orf19 glycosyltransferases ranging from 87 to 99% identity (see Fig. S2 in the supplemental material).
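The counting of protein variants and the percent identity between them can be sketched as follows. This is a simplified illustration under stated assumptions: the sequences are taken to be already aligned and of equal length, positions where either sequence has a gap are skipped, and the strain labels and peptide strings are hypothetical rather than Orf18 or Orf19 data.

```python
def distinct_variants(proteins):
    """Group strain names by identical amino acid sequence.

    proteins maps a strain name to its sequence; each returned list is one variant.
    """
    groups = {}
    for strain, seq in proteins.items():
        groups.setdefault(seq, []).append(strain)
    return list(groups.values())

def percent_identity(a, b):
    """Percent identical residues between two pre-aligned, equal-length sequences.

    Columns in which either sequence has a gap ('-') are ignored (an assumption;
    alignment programs differ in how they count gapped columns).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)

# Hypothetical toy data: four strains carrying three distinct variants.
toy = {
    "strain_1": "MKLIVG",
    "strain_2": "MKLIVG",
    "strain_3": "MKLIAG",
    "strain_4": "MRLIAG",
}
variants = distinct_variants(toy)
print(len(variants), "variants among", len(toy), "strains")
print(round(percent_identity(toy["strain_1"], toy["strain_3"]), 1))
```

Grouping by exact sequence identity is what allows two strains from different locus classes to be reported as sharing the same variant.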
In the present study, we compared the sequences of the 19 distinct classes of LOS biosynthetic loci of C. jejuni, 11 of which were newly described here. The sequence analysis highlights genetic mechanisms that are utilized by C. jejuni to generate LOS diversity. The most obvious mechanism is the variation in gene content resulting in the 19 different LOS biosynthesis loci. Based on shared gene content, several of these loci are related genetically and appear to have arisen by the insertion and/or deletion of genes or gene cassettes. There is a group of four LOS biosynthesis loci that possess the necessary genes to synthesize sialylated LOS but differ by gene insertions or gene duplication (classes A, B, M, and R; Fig. 2). In particular, each class possesses genes encoding a sialyltransferase (cstII), a sialic acid synthase (neuB), an N-acetylglucosamine-6-phosphate 2-epimerase (neuC), and a CMP-Neu5Ac synthetase (neuA). The class C locus also possesses genes to synthesize sialylated LOS, but it appears more distantly related to these other classes based on sequence comparison (9). There were three other groups of LOS biosynthesis loci that did not possess the genes necessary to synthesize sialylated LOS. Six newly identified LOS biosynthesis loci (I, J, K, N, Q, and S) possessed genes found in classes D and F (Fig. 1), and three of these loci possessed a gene cassette of four HS:41 capsular genes. Furthermore, LOS biosynthesis classes G and L possess orf3 and orf16 and could be distantly related to the class D locus. However, these two classes exhibit multiple insertion and deletion events that make parsimonious descriptions difficult. On the other hand, the organizations of classes E, H, O, and P LOS biosynthesis loci do allow parsimonious recombination event predictions that explain derivation of class H from the others in this class (Fig. 4).
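One way to make the notion of relatedness by shared gene content concrete is a set comparison. The sketch below uses simplified gene lists for classes D, F, K, N, and Q as read from the descriptions in the Results; genes common to all classes are omitted, the one-domain and two-domain forms of orf3 are collapsed into a single label, and the Jaccard index is an illustrative choice rather than a measure used in this study.

```python
# Simplified gene content per locus class, read from the text (an approximation).
# Assumption: one-domain and two-domain orf3 are treated as the same gene label.
LOCI = {
    "D": {"orf3", "orf17", "orf18", "orf19", "orf20", "orf16"},
    "F": {"orf3", "orf18", "orf19", "orf20", "orf16"},
    "K": {"orf3", "orf17", "orf18", "orf19", "orf49", "orf50"},
    "N": {"orf3", "orf17", "orf18", "orf38"},
    "Q": {"orf3", "orf17", "orf18", "orf19", "orf20", "orf46", "orf16"},
}

def jaccard(a, b):
    """Shared genes divided by all genes present in either locus."""
    return len(a & b) / len(a | b)

def rank_by_similarity(reference, loci):
    """Other classes sorted from most to least similar to the reference class."""
    ref = loci[reference]
    scores = [
        (name, jaccard(ref, genes))
        for name, genes in loci.items()
        if name != reference
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)

for name, score in rank_by_similarity("D", LOCI):
    print(f"class {name}: {score:.2f}")
```

A presence or absence comparison of this kind ignores gene order, sequence divergence, and partial genes, all of which the sequence-based comparisons in this study take into account.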
Together, each group of LOS biosynthetic loci demonstrates the mosaic nature of gene and gene cassette insertion and deletion that can result in LOS structure variation.
It has been observed recently that the introduction of a complete LOS biosynthesis locus class can occur between strains by horizontal transfer (8, 31). Presumably, these exchanges involve recombination between homologous regions that flank the LOS biosynthesis locus. Our sequence characterization of new classes of LOS loci demonstrates that recombination events often occur within the LOS locus, where no obvious regions of homology exist. Thus, the exact mechanism for the production of new LOS loci described would potentially involve recombination between the large number of A and T polynucleotides or specific A+T-rich sequences that may be common between these loci. Indeed, the G+C content of the LOS loci (22 to 28%) is slightly lower than the G+C content for the rest of the genome (30%) and is often even lower than 20% in regions where insertions have occurred.
It is also not clear whether all of the newly described mosaic LOS loci create functional biosynthetic gene clusters. It is quite possible that the newly imported glycosyltransferases or modification enzymes may not recognize the existing LOS structures. Also, it is possible that during recombination only portions of genes could be transferred or that resident genes could be disrupted. Certainly, there is evidence of gene disruption in LOS classes H and P where orf39 has disrupted orf26. Also, the event that inserted orf16 in strain GC149 altered the 3' terminal ends of the adjacent neuA gene and the incoming orf16 gene. In addition, it appears that the class L locus acquired only a portion of the orf5/10.
Aside from the differences in gene content, the sequence analysis also highlights the subtle genetic differences that are utilized by C. jejuni to generate LOS diversity. These include phase-variable HGTs, gene inactivation by the deletion of a single base or multiple bases (without phase variation), and missense mutations leading to "allelic" glycosyltransferases. Indeed, we demonstrated that OOF mutations in orf18, orf19, and orf20 (phase variable and non-phase variable) affected LOS structures (Fig. 5 and 6). Moreover, the importance of phase variation is highlighted by the fact that 18 of the 19 LOS biosynthetic loci examined thus far possess at least one HGT. As for the allelic glycosyltransferases, it remains to be determined whether any of the protein variants encoded by orf18 and orf19 exhibit different enzymatic specificities; answering this question will require that the LOS structures for all of these strains be determined. Previously, different enzymatic specificities were observed for Orf7 (CstII) from classes A and B (10). There were alleles that were bifunctional (both α-2,3- and α-2,8-sialyltransferase activities) and alleles that were monofunctional (α-2,3-sialyltransferase activity). The alignment of the protein sequences showed that most of the bifunctional CstII alleles possessed the residue Asn51 (10, 11).
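The effect of a length change in an HGT on the reading frame, mentioned above for phase-variable genes, can be illustrated with a toy open reading frame. Everything in this sketch is hypothetical: the ORF builder, the tail sequence, and the tract lengths are chosen only so that nine G residues leave the gene in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon_index(orf):
    """0-based codon index of the first stop codon in frame 1, or None if absent."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def frame_state(orf):
    """Return 'IF' if the ORF is a whole number of codons ending at its only stop, else 'OOF'."""
    n_codons = len(orf) // 3
    intact = len(orf) % 3 == 0 and first_stop_codon_index(orf) == n_codons - 1
    return "IF" if intact else "OOF"

def build_orf(g_count):
    """Hypothetical ORF: start codon, a G-tract of g_count bases, then a fixed tail."""
    return "ATG" + "G" * g_count + "AAACTGATTTAA"

for g_count in (8, 9, 10):
    orf = build_orf(g_count)
    print(g_count, frame_state(orf), first_stop_codon_index(orf))
```

In this toy case, losing one G from the nine-G tract shifts the frame and exposes a premature TGA codon, whereas gaining one G shifts the frame so that no stop codon is read within the sequence.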
Finally, despite the ability to form a variety of LOS biosynthesis loci, we previously observed that over 60% of LOS biosynthesis loci from more than 100 clinical and environmental strains of C. jejuni belonged to classes A, B, or C (28). This suggests that possessing these particular loci that have the potential to synthesize a sialylated LOS may be advantageous in host interactions. It is also noteworthy that variation in other C. jejuni glycan structures (capsule and O-linked flagellar glycosylation) occurs by similar mechanisms (18, 19). Together, these observations indicate that, despite our knowledge of locus sequences and glycan structural data, we still require a greater understanding of the significance of C. jejuni glycan variability.
Part of this study was supported by the U.S. Department of Agriculture, Agricultural Research Service CRIS project 5325-42000-045 (to C.T.P. and R.E.M.) and by Human Frontier Science Program grant RGP 38/2003 (to M.G., N.Y., and H.P.E.).
Published ahead of print on 13 June 2008.
Supplemental material for this article may be found at http://jb.asm.org/.
α-2,8-linked sialic acid. J. Biol. Chem. 281:11480-11486.