Journal of Bacteriology, November 2006, p. 7914-7921, Vol. 188, No. 22
0021-9193/06/$08.00+0 doi:10.1128/JB.00802-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
,2 and
Christina Schäffer1*
Zentrum für NanoBiotechnologie, Universität für Bodenkultur Wien, A-1180 Wien, Austria,1 Institut für Medizinische Physik und Biophysik, Universität Münster, D-48149 Münster, Germany,2 Sequenom GmbH, D-22761 Hamburg, Germany3
Received 6 June 2006/ Accepted 24 August 2006
[→2)-α-L-Rhap-(1→3)-β-L-Rhap-(1→2)-α-L-Rhap-(1→
] repeating units, with a 2-O-methyl modification of the terminal trisaccharide at the nonreducing end of the glycan chain and a core saccharide as linker to the S-layer protein. On sodium dodecyl sulfate-polyacrylamide gels, four bands appear, of which three represent glycosylated S-layer proteins. In the present study, nanoelectrospray ionization time-of-flight mass spectrometry (MS) and infrared matrix-assisted laser desorption/ionization orthogonal time-of-flight mass spectrometry were adapted for analysis of this high-molecular-mass and water-insoluble S-layer glycoprotein to refine insights into its glycosylation pattern. This is a prerequisite for artificial fine-tuning of S-layer glycans for nanobiotechnological applications. Optimized MS techniques allowed (i) determination of the average masses of three glycoprotein species to be 101.66 kDa, 108.68 kDa, and 115.73 kDa, (ii) assignment of nanoheterogeneity to the S-layer glycans, with the most prevalent variation between 12 and 18 trisaccharide repeating units, and the possibility of extension of the already-known
→3)-α-L-Rhap-(1→3)-α-L-Rhap-(1→ core by one additional rhamnose residue, and (iii) identification of a third glycosylation site on the S-layer protein, at position threonine-590, in addition to the known sites threonine-620 and serine-794. The current interpretation of the S-layer glycoprotein banding pattern is that in the 101.66-kDa glycoprotein species only one glycosylation site is occupied, in the 108.68-kDa glycoprotein species two glycosylation sites are occupied, and in the 115.73-kDa glycoprotein species three glycosylation sites are occupied, while the 94.46-kDa band represents nonglycosylated S-layer protein.
The glycosylated surface layer (S-layer) protein SgsE from Geobacillus stearothermophilus NRS 2004/3a is a promising model for studies on prokaryotic glycosylation because of the homopolymeric nature of the glycan, which is composed only of rhamnose residues (35). In sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis, the mature S-layer glycoprotein is separated into four bands. Three of them represent broad bands in the molecular mass range of approximately 119 to 170 kDa which also give a positive periodic acid-Schiff (PAS) staining reaction, indicating the presence of covalently linked glycan chains (7, 14, 35). The 93-kDa band is nonglycosylated, and the estimated molecular mass concurs with the calculated mass derived from the amino acid sequence of the mature structural protein, SgsE, after cleavage of the 30-amino-acid signal peptide from the precursor protein (GenBank accession number AF328862) (35). Dependent on the cultivation conditions of the organism (batch versus continuous culture), variations exist in the degree of glycosylation of the individual protein bands, all of which possess identical N termini indicative of identical protein portions (20, 35). Previous nuclear magnetic resonance (NMR) experiments have demonstrated that only one type of glycan chain is present on the individual glycoprotein species, however, with considerable length variations. The S-layer glycans consist of trisaccharide repeats with the structure [
→2)-α-L-Rhap-(1→3)-β-L-Rhap-(1→2)-α-L-Rhap-(1→]n and of a short core saccharide consisting of α-1,3-linked L-Rhap residues, attached to carbon 3 of a β-D-galactose residue that serves as the linkage sugar to the S-layer polypeptide backbone (35). So far, two glycosylation sites for O-linked glycans have been determined after a papain degradation experiment of SgsE glycoprotein, namely, amino acids threonine-620 and serine-794 (with the numbers referring to the positions on the precursor protein) (35). These data, however, cannot satisfactorily explain the existence of three broad S-layer glycoprotein bands on SDS-PAGE gels.
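The glycan architecture described above (n trisaccharide repeats, a short rhamnose core, and one galactose linkage sugar) can be turned into a rough mass estimate. The following is only a sketch under stated assumptions: the residue masses are standard average values, a core of two rhamnoses and a single terminal O-methyl group are assumed, and the function name `glycan_mass_da` is hypothetical.

```python
# Average residue masses in Da (monosaccharide minus one water); standard values.
RHAMNOSE_RESIDUE = 146.14   # deoxyhexose residue, C6H10O4
GALACTOSE_RESIDUE = 162.14  # hexose residue, C6H10O5
METHYL_INCREMENT = 14.03    # CH2 added by one O-methylation


def glycan_mass_da(repeats, core_rhamnoses=2, methylated=True):
    """Estimate the average mass a protein-bound glycan chain contributes.

    Assumptions (illustrative): each repeating unit holds three rhamnoses,
    the core holds `core_rhamnoses` rhamnoses plus one galactose, and at
    most one methyl group sits at the nonreducing end.
    """
    rhamnoses = 3 * repeats + core_rhamnoses
    mass = rhamnoses * RHAMNOSE_RESIDUE + GALACTOSE_RESIDUE
    if methylated:
        mass += METHYL_INCREMENT
    return mass


for n in (12, 15, 18):
    print(f"n = {n:2d}: {glycan_mass_da(n) / 1000:.2f} kDa")
```

Under these assumptions, each additional repeating unit adds about 438 Da, so chain-length variation alone shifts the glycan mass by several kilodaltons.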
To better understand the glycosylation pattern of the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a, we have investigated the intact, glycosylated S-layer protein SgsE and selected glycopeptides derived thereof using different mass spectrometric approaches. These data will allow interpretation of S-layer protein glycosylation in a more general way, because similar SDS-PAGE patterns of S-layer glycoproteins have also been observed with other organisms (24).
General methods. SDS-PAGE was carried out according to a standard protocol (15) using a Protean II electrophoresis apparatus (Bio-Rad Laboratories, Vienna, Austria). Protein bands were visualized with Coomassie blue R-250 staining reagent. The protein concentration was determined using the Bradford reagent (Bio-Rad) (4). PAS staining for carbohydrates was performed according to the methods of Hart and coworkers (11).
Preparation of S-layer glycoprotein and glycopeptide samples. S-layer glycoprotein was isolated according to a standard procedure (21). The lyophilized protein was further purified on a Sephacryl S-200 column (1.6 by 60 cm; GE Healthcare, Uppsala, Sweden) using 2 M guanidinium hydrochloride (GHCl) in 50 mM Tris-HCl buffer, pH 7.5, as eluent. After extensive dialysis, first against 15 mM CaCl2 and then against distilled water, the protein was lyophilized and stored.
After degradation with papain (Sigma-Aldrich, Vienna, Austria), S-layer glycopeptides were isolated and purified as described elsewhere (35). Individual glycopeptide fractions were dried in a SpeedVac centrifuge and stored at -20°C.
β-Elimination and alkyl-aminylation of the S-layer glycoproteins and glycopeptides. A 200-pmol aliquot of S-layer glycoprotein (corresponding to approximately 20 µg) or 500 pmol of glycopeptides was dissolved in 50 to 100 µl of 50% aqueous isopropylamine (Sigma-Aldrich) and incubated at 50°C for 18 h (10). The amine was evaporated at 50°C in a fume hood. The protein was washed twice with 10% ammonia to evaporate residual amines followed by two washes with distilled water and dried in a SpeedVac centrifuge. The dried protein was resuspended in distilled water by ultrasonication and proteolytically digested. Routinely, 200 pmol of S-layer protein was treated with 0.2 µg of trypsin or chymotrypsin (both from Roche Diagnostics, Mannheim, Germany) in a final volume of 20 µl of 10 mM NH4HCO3 solution. Digestion was carried out at 37°C for 18 to 36 h. Peptides were desalted prior to mass spectrometric analysis on ZipTip clean-up columns (Millipore, Eschborn, Germany) according to the manufacturer's instructions.
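As a plausibility check of the amounts quoted above, the conversion from picomoles to micrograms and the resulting substrate-to-enzyme ratio can be sketched as follows. The 100-kDa molecular mass is an assumption chosen for illustration (the text gives only "approximately 20 µg"), and the function name is hypothetical.

```python
def pmol_to_microgram(amount_pmol, molecular_mass_kda):
    """Convert an amount in pmol to µg (pmol x kDa gives ng; divide by 1000)."""
    return amount_pmol * molecular_mass_kda / 1000.0


# Assumed average molecular mass of the glycoprotein (illustrative only).
ASSUMED_MASS_KDA = 100.0

substrate_ug = pmol_to_microgram(200, ASSUMED_MASS_KDA)  # 200 pmol per digest
enzyme_ug = 0.2                                          # trypsin or chymotrypsin
substrate_to_enzyme = substrate_ug / enzyme_ug           # weight ratio

print(f"{substrate_ug:.1f} µg substrate, "
      f"substrate:enzyme = {substrate_to_enzyme:.0f}:1 (wt/wt)")
```

With the assumed mass, 200 pmol corresponds to 20 µg, so 0.2 µg of protease amounts to roughly a 1:100 enzyme-to-substrate ratio by weight.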
Release of the O-linked glycan. The reductive release of glycans was performed according to a protocol of Huang and coworkers (13), with minor modifications. A 20-µg aliquot of glycoprotein was dissolved in 10 µl of a solution containing 5 mg of NaBH4 in 1 ml of 28% ammonia and incubated at 50°C for 18 h. After drying the sample in a SpeedVac centrifuge and redissolving it in distilled water, the released oligosaccharides were purified on house-made carbon columns (29).
MS analyses. (i) Orthogonal TOF-MS. The orthogonal time-of-flight mass spectrometry (TOF-MS) apparatus is a modified prototype machine which was described recently (9). For infrared matrix-assisted laser desorption/ionization mass spectrometry (IR-MALDI-MS), an Er:YAG laser (Speser, Spektrum Laser, Berlin, Germany) emitting pulses of 100-ns duration at a wavelength of 2.94 µm was used. Analysis was performed in positive ion mode, and mass spectra were processed using the MoverZ3 software (version 2001.02.13; Genomic Solutions, Ann Arbor, MI). The S-layer protein (50 pmol/µl) was dissolved in 8 M urea, and 0.5 µl of the respective sample was premixed with glycerol and then spotted on a glass plate. Mixtures with different sample-to-matrix ratios (5:1, 1:1, 1:5, 1:10, and 1:50 [vol/vol]) were examined. A modified MALDI sample plate containing a 1.2-mm-thick groove was used to accommodate the glass plate.
(ii) ESI-QTOF. Positive ion mode (+) MS of the (glyco)peptides and oligosaccharides was carried out by use of an electrospray ionization-quadrupole TOF (ESI-QTOF) instrument (Micromass, Manchester, United Kingdom). A house-made capillary puller was used for the production of nanospray glass capillaries. Capillary voltage applied on an internal wire electrode was 1,100 V; the cone voltage used for peptide analysis was 40 V, for oligosaccharides it was 60 V, and for glycopeptides it was 40 to 90 V. Peptide sequencing was conducted using argon as the collision gas. The applied collision energy was 15 to 40 V. The ß-eliminated samples were dissolved in water, yielding a concentration of the stock solution of 20 pmol/µl. Four microliters of that solution was mixed with 5 µl of methanol and 1 µl of 10% formic acid. The final concentration of the ß-eliminated samples was 8 pmol/µl. The final concentration of glycopeptides and glycan chains varied between 4 and 20 pmol/µl in 50% methanol containing various concentrations of formic acid (0.1 to 10%, vol/vol).
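The spray-solution composition for the β-eliminated samples follows from simple dilution arithmetic. The sketch below reproduces it, assuming that volumes are additive on mixing (a simplification for methanol-water mixtures); the helper name `diluted` is hypothetical.

```python
def diluted(stock_value, aliquot_ul, total_ul):
    """Concentration (or percentage) after diluting an aliquot to total_ul."""
    return stock_value * aliquot_ul / total_ul


# Volumes taken from the protocol above, in µl.
sample_ul, methanol_ul, formic_ul = 4.0, 5.0, 1.0
total_ul = sample_ul + methanol_ul + formic_ul  # assumes additive volumes

sample_pmol_per_ul = diluted(20.0, sample_ul, total_ul)   # 20 pmol/µl stock
methanol_percent = diluted(100.0, methanol_ul, total_ul)  # neat methanol
formic_percent = diluted(10.0, formic_ul, total_ul)       # 10% formic acid stock

print(f"sample: {sample_pmol_per_ul:.1f} pmol/µl, "
      f"methanol: {methanol_percent:.0f}%, formic acid: {formic_percent:.1f}%")
```

The result matches the stated final sample concentration of 8 pmol/µl, in a solution of about 50% methanol and 1% formic acid.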
FIG. 1. SDS-PAGE analysis of the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a (8% gel). Lane 1, molecular mass standard (AllBlue Precision Plus Protein standard; Bio-Rad); lanes 2 and 4, S-layer glycoprotein derived from continuous culture; lanes 3 and 5, S-layer glycoprotein derived from batch culture; lanes 2 and 3, Coomassie blue staining; lanes 4 and 5, periodic acid-Schiff staining. The amounts of total protein loaded per gel lane were 10 µg in lanes 2 and 3 and 25 µg in lanes 4 and 5. The arrow indicates the nonglycosylated S-layer protein protomer.
For analysis of the three S-layer glycoprotein species of G. stearothermophilus NRS 2004/3a, different MS approaches have been pursued. In MALDI-MS sample preparation, the solubilized sample is mixed with various matrices and loaded onto a metal target. Since S-layers usually are poorly soluble in water, solubilization is frequently performed using GHCl (25). This solubilization procedure, however, is not directly compatible with MS analysis. Allmaier and coworkers have developed a method for investigations on the S-layer glycoprotein of Thermoanaerobacterium thermosaccharolyticum E207-71 in which GHCl has been almost completely removed from the target by gentle washing with water (1). However, this procedure did not work for the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a. Urea, another chaotropic agent, has also been reported for solubilization of S-layer proteins (25); notably, this agent has previously been successfully used as a matrix in IR-MALDI (28).
To optimize the IR-MALDI conditions for the analysis of the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a derived from batch culture, different ratios of urea to glycerol were tested. Glycerol is widely used as a matrix for IR-MALDI-MS, e.g., for the analysis of large proteins (2). Diluting the sample, dissolved in 8 M urea, with glycerol at ratios of 1:5 and 1:10 (vol/vol) made acquisition of spectra possible. The other ratios (5:1, 1:1, and 1:50 [vol/vol]) produced at most minor analyte intensities. Figure 2 displays the mass spectrum acquired from the 1:10-diluted sample. The average masses of the four ion signals were determined to be 94.46, 101.66, 108.68, and 115.73 kDa, respectively. Both urea and glycerol are known to form analyte-matrix adducts with the sample, which increases the apparent mass of the protein, and mass resolution is reduced in the high-mass range. Taking these effects into account, the average mass of the first ion peak (94.46 kDa) is in good agreement with the calculated theoretical molecular mass of the mature S-layer protein (93.68 kDa; mass deviation, 0.83%). The average mass difference between neighboring peaks of the singly charged ions was calculated to be 7.09 kDa, which corresponds to a glycan chain composed of 15 tri-rhamnose repeating units with an average molecular mass of 7.05 kDa (mass deviation, 0.57%). Relating these results to the SDS-PAGE evidence, where three S-layer glycoprotein species are visible, supports the hypothesis of three glycosylation sites on the SgsE protein of G. stearothermophilus NRS 2004/3a. The first peak (94.46 kDa) would originate from the nonglycosylated protein; the second peak (101.66 kDa), differing by the mass of a single glycan chain, implies a single occupied glycosylation site; the third peak (108.68 kDa) corresponds to SgsE with two occupied glycosylation sites; and the fourth peak (115.73 kDa) would represent triply glycosylated SgsE.
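The peak-spacing argument can be retraced numerically. The sketch below uses only the masses quoted in the text (four observed peaks, the 93.68-kDa theoretical protein mass, and the 7.05-kDa glycan mass); the variable and function names are hypothetical.

```python
# Observed average masses of the four IR-MALDI ion signals, in kDa.
peaks_kda = [94.46, 101.66, 108.68, 115.73]

# Theoretical values quoted in the text, in kDa.
THEORETICAL_PROTEIN_KDA = 93.68
THEORETICAL_GLYCAN_KDA = 7.05


def percent_deviation(observed, theoretical):
    """Relative deviation of an observed from a theoretical mass, in percent."""
    return 100.0 * (observed - theoretical) / theoretical


# Differences between neighboring peaks and their mean.
spacings = [b - a for a, b in zip(peaks_kda, peaks_kda[1:])]
mean_spacing = sum(spacings) / len(spacings)

protein_dev = percent_deviation(peaks_kda[0], THEORETICAL_PROTEIN_KDA)
glycan_dev = percent_deviation(mean_spacing, THEORETICAL_GLYCAN_KDA)

print("spacings (kDa):", [round(s, 2) for s in spacings])
print(f"mean spacing: {mean_spacing:.2f} kDa")
print(f"protein deviation: {protein_dev:.2f}%, glycan deviation: {glycan_dev:.2f}%")
```

The individual spacings are 7.20, 7.02, and 7.05 kDa, giving the 7.09-kDa mean and the 0.83% and 0.57% deviations reported above.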
FIG. 2. (+)IR-MALDI-orthogonal TOF mass spectrum of the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a. The spectrum shows the average mass of the nonglycosylated species of SgsE at 94.46 kDa and those of the three inherently heterogeneic glycoprotein species at 101.66 kDa, 108.68 kDa, and 115.73 kDa.
In the present study, β-elimination followed by Michael addition of isopropylamine was used to replace the glycan chains at the respective glycosylation sites of the S-layer glycoprotein by an alkylamine. After proteolytic digestion, the peptides were analyzed by nano-ESI-QTOF tandem MS (MS/MS). The alkylaminylated peptides were identified by the observed mass shift in comparison to the theoretical mass of the nonglycosylated peptide. By this method, however, only the serine glycosylation site could be unambiguously identified. Threonine is known to be less reactive under alkaline conditions due to the presence of a methyl group, which might protect the β-carbon against nucleophilic attack (5, 17, 32). On the other hand, stronger alkaline reaction conditions give rise to a number of side reactions, which make the unequivocal identification of the glycosylation site(s) impossible (38).
Despite these drawbacks, we succeeded in determining the position of the third glycosylation site by analyzing various glycopeptide fractions available from a previous papain digest (35) by nano-ESI-QTOF, either directly or after β-elimination. In one fraction, a new glycopeptide species was detected, but the abundance of the respective high-molecular-mass glycopeptide ions [M + 2H+Na+K]4+ with m/z = 2,102.96 and m/z = 2,212.31, corresponding to the theoretical average mass of the peptide TFDEEVTTGSNITVVQ with 14 and 15 repeating units, respectively, was too low to perform MS/MS (data not shown). When choosing cone conditions that favor in-source fragmentation, a short glycopeptide with one galactose and one rhamnose residue attached to the peptide moiety was isolated as doubly charged ions with m/z of 1,024.68 and subjected to a virtual MS3 experiment (Fig. 3A). In the resulting spectrum, the loss of one rhamnose residue followed by the loss of one galactose residue was clearly displayed by the doubly charged ions with m/z of 951.65 and 870.59, respectively. The identity of the peptide portion was proven by a number of y and b ions and by sequence comparison with the already-known sequence of SgsE. Unfortunately, loss of the glycan is the first fragmentation event to occur on the glycopeptide. Thus, due to the presence of four threonine residues and one serine residue in the peptide, an unambiguous identification of the glycosylation site was impossible. The same glycopeptide fraction was also subjected to β-elimination with isopropylamine. Several doubly charged peptides resulting from complete and incomplete Michael addition of isopropylamine to the peptide TFDEEVTTGSNITVVQ were detected as nonsodiated and sodiated species. In Fig. 3B, the CID spectrum derived from the ion [M + H+Na-H2O]2+ with m/z of 872.53 is displayed; the additionally obtained CID spectra of the peptide ions [M + 2H+IA]2+ with m/z of 891.08 and [M + 2H-H2O]2+ with m/z of 861.55 are not shown. The position of the glycosylation site has been determined by the increment mass differences of 83.08 and 83.06 Da between ions y8# and y9# and b7 and b8, respectively. This shift corresponds to the mass of a β-methyldehydroalanine residue replacing a threonine residue. β-Methyldehydroalanine is the reaction product of the β-elimination reaction of threonine without Michael addition of isopropylamine. Except for the N-terminal amino acid, full sequence coverage of that peptide has been achieved by MS/MS. By comparing the analyses of both the glycopeptide and the β-eliminated peptide, the third glycosylation site could be unambiguously demonstrated to be threonine-590 of the SgsE precursor of G. stearothermophilus NRS 2004/3a.
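The sugar losses and the threonine mass shift discussed above can be checked with a short calculation: for ions of equal charge z, the neutral loss equals the m/z difference multiplied by z. The sketch below is illustrative only; the residue masses are standard monoisotopic values, while the 0.1-Da matching tolerance and all function names are assumptions made here.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASSES = {
    "deoxyhexose (e.g., rhamnose)": 146.058,
    "hexose (e.g., galactose)": 162.053,
}
THREONINE_RESIDUE = 101.048
WATER = 18.011


def neutral_loss(mz_before, mz_after, charge):
    """Mass lost between two ions that carry the same charge."""
    return (mz_before - mz_after) * charge


def assign(loss_da, tolerance_da=0.1):
    """Return the first residue matching the loss within the assumed tolerance."""
    for name, mass in RESIDUE_MASSES.items():
        if abs(mass - loss_da) <= tolerance_da:
            return name
    return None


# Doubly charged ions reported for the short glycopeptide.
first_loss = neutral_loss(1024.68, 951.65, 2)
second_loss = neutral_loss(951.65, 870.59, 2)

# Threonine minus water gives the beta-methyldehydroalanine residue mass.
dehydro_thr = THREONINE_RESIDUE - WATER

print(f"{first_loss:.2f} Da -> {assign(first_loss)}")
print(f"{second_loss:.2f} Da -> {assign(second_loss)}")
print(f"beta-methyldehydroalanine residue: {dehydro_thr:.2f} Da")
```

The two losses come out near 146.06 and 162.12 Da, consistent with one deoxyhexose followed by one hexose, and the calculated dehydrated threonine residue (about 83.04 Da) lies close to the observed increments of 83.08 and 83.06 Da.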
FIG. 3. (A) (+)Nano-ESI-QTOF MS/MS spectrum of the doubly charged ions at m/z = 1,024.68, representing glycopeptide TFDEEVTTGSNITVVQ with one galactose and one rhamnose attached to the peptide backbone. (B) (+)Nano-ESI-QTOF MS/MS spectrum of the doubly charged ions at m/z = 872.53 originating from glycopeptide TFDEEVTTGSNITVVQ after β-elimination. The glycosylation site is displayed by the y9 ion and the b8 ion and a loss of water from the threonine-590 in the sequence of SgsE. #, sodiated ions.
α-1,3-linked sugar residue. The distribution of chain length was confirmed by the analysis of several purified glycopeptides derived from a previous papain digest of the S-layer glycoprotein (35). Nano-ESI-QTOF-MS analysis of the glycopeptides revealed charge states from +4 to +6. Different combinations of charge-specifying adducts with hydrogen, sodium, and potassium generate multiple peaks of the same charge state. Considering these different charge-specifying adducts, the masses derived from the experimentally acquired data were calculated and compared with the theoretical average masses, e.g., for the glycopeptide species with the sequence ATLTSADVIRVD (Table 2). The mass spectrum of the glycopeptide ATLTSADVIRVD carrying a single glycan chain with a distribution of 13 to 17 repeating units and furthermore with an additional rhamnose is displayed in Fig. 5.
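To illustrate how adduct combinations multiply the peaks within one charge state, the sketch below enumerates every mixture of H+, Na+, and K+ that yields a given charge and computes the corresponding m/z. It is a minimal example under stated assumptions: the cation masses are standard monoisotopic values, the neutral mass of 8,000 Da is a placeholder rather than a measured value, and `adduct_series` is a hypothetical helper.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Monoisotopic masses of the charge-carrying cations in Da (standard values).
ADDUCT_MASSES = {"H": 1.00728, "Na": 22.98922, "K": 38.96316}


def adduct_series(neutral_mass, charge):
    """Map ion labels such as '[M+2H+Na+K]4+' to their m/z values."""
    series = {}
    for combo in combinations_with_replacement(ADDUCT_MASSES, charge):
        counts = Counter(combo)
        parts = [
            f"{counts[a] if counts[a] > 1 else ''}{a}"
            for a in ADDUCT_MASSES
            if counts[a]
        ]
        label = f"[M+{'+'.join(parts)}]{charge}+"
        total = neutral_mass + sum(ADDUCT_MASSES[a] for a in combo)
        series[label] = total / charge
    return series


# Placeholder neutral mass (illustrative only, not taken from the tables).
series = adduct_series(8000.0, 4)
for label, mz in sorted(series.items(), key=lambda item: item[1]):
    print(f"{label:<18} m/z {mz:.2f}")
```

For a charge state of +4, three cation types already give 15 possible adduct compositions, which is why overlapping ion series have to be considered when assigning the peaks.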
TABLE 1. Nano-ESI-QTOF MS analysis of oligosaccharides released from the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a
FIG. 4. (+)Nano-ESI-QTOF mass spectrum of the released glycan chains of the S-layer glycoprotein from G. stearothermophilus NRS 2004/3a. The main peaks correspond to [M + 2H+2K]4+. Chain lengths vary between 12 and 17 repeating units (RU) for both the glycan structure with two core rhamnoses (first series; RU = X) and the one with three core rhamnoses (second series; RU = X + Rha).
TABLE 2. Nano-ESI-QTOF MS analysis of the S-layer glycopeptide ATLTSADVIRVD of G. stearothermophilus NRS 2004/3a
FIG. 5. (+)Nano-ESI-QTOF mass spectrum of glycopeptides belonging to the peptide ATLTSADVIRVD derived from the S-layer glycoprotein of G. stearothermophilus NRS 2004/3a. The most abundant peaks derive from overlapping peaks of the two molecular ion series [M + 3H+K]4+ and [M + 2H+2Na]4+; all glycopeptides carry a single glycan chain attached to serine-794. The number of repeating units (RU) varies between 13 and 17.
-1,3-linked rhamnose residue of the terminating trisaccharide repeating unit at the nonreducing end is O-methylated at carbon 2 (35), it is more likely that the newly identified additional rhamnose residue represents core variation rather than structural variation at the nonreducing end of the glycan chain. This further implies that in the S-layer glycoprotein glycan structure of G. stearothermophilus NRS 2004/3a, the already-known core region
→3)-α-L-Rhap-(1→3)-α-L-Rhap-(1→ (35) can be optionally extended by one additional rhamnose residue. Core variability has already been observed in S-layer glycoprotein glycans of Aneurinibacillus thermoaerophilus DSM 10155 (44) and A. thermoaerophilus GS4-97 (34).

Conclusions. In general, S-layer glycoproteins are regarded as promising tools for nanobiotechnological applications (33, 37) because (i) they represent natural protein self-assembly systems, (ii) they can be tuned for distinct purposes through the addition of functional peptide or protein domains by genetic engineering methods, and (iii) naturally occurring glycosylation sites and attached S-layer glycan chains, which may eventually be rationally modified for certain applications by carbohydrate engineering, add a new and very valuable dimension to this S-layer protein-based molecular construction kit. A detailed understanding of the naturally occurring S-layer glycoprotein is a prerequisite for the envisaged nanobiotechnological applications of S-layer glycoproteins (e.g., carbohydrate vaccines or receptor mimics).
In the present study, a thorough analysis of an S-layer protein glycosylation pattern has been performed using G. stearothermophilus NRS 2004/3a as a model system. This organism has been chosen because the structural and biosynthetic knowledge about its S-layer glycoprotein is most advanced in our laboratory (20, 27, 35). So far, the lack of adequate analytical techniques has prevented a conclusive interpretation of the multiple banding pattern of this S-layer glycoprotein as observed on SDS-PA gels. Optimization and adaptation of MS methods to the water-insoluble S-layer glycoprotein (this study) allowed (i) determination of the average masses of the three inherently heterogeneic glycoprotein species of SgsE to be 101.66 kDa, 108.68 kDa, and 115.73 kDa, corresponding to SgsE with different numbers of attached glycan chains, (ii) clear assignment of nanoheterogeneity to each glycan chain, with each of them revealing the most prevalent variation between 12 and 18 trisaccharide repeating units and the possibility of extension of the already-known di-rhamnose core region by one additional rhamnose residue, and (iii) unambiguous identification of a third glycosylation site on the 93-kDa SgsE S-layer protein, namely, at position threonine-590, in addition to the known sites of threonine-620 and serine-794. These data lead to the current interpretation that in the 101.66-kDa glycoprotein species only one glycosylation site is occupied, in the 108.68-kDa glycoprotein species two glycosylation sites are occupied, and in the 115.73-kDa glycoprotein species three glycosylation sites are occupied. Future efforts will be directed towards identifying which glycosylation sites are used in the 101.66-kDa and in the 108.68-kDa glycoprotein species.
These data clearly support the high in vivo potential for diversification of bacterial S-layer glycoproteins in general.
This work was supported by the Austrian Science Fund, projects P15840-B10 and P18013-B10 (to P.M.), the scholarship "International Communication" of the Österreichische Forschungsgemeinschaft (to K.S.), and the scholarship "Stipendium für kurzfristige wissenschaftliche Arbeiten im Ausland" of the Universität für Bodenkultur Wien (to K.S.).
Published ahead of print on 8 September 2006.
, J. Peter-Katalinić, F. Hillenkamp, and S. Berkenkamp. 2005. Analysis of gangliosides directly from thin-layer chromatography plates by infrared matrix-assisted laser desorption/ionization orthogonal time-of-flight mass spectrometry with a glycerol matrix. Anal. Chem. 77:4098-4107.
. 2001. Glycoprotein identification and localization of O-glycosylation sites by mass spectrometric analysis of deglycosylated/alkylaminylated peptide fragments. Anal. Biochem. 290:47-59.
, J. 2005. O-glycosylation of proteins. Methods Enzymol. 405:139-171.