Sci Adv. 2024 Feb 2;10(5):eadk8173.
doi: 10.1126/sciadv.adk8173. Epub 2024 Jan 31.

N-glycosylation as a eukaryotic protective mechanism against protein aggregation

Ramon Duran-Romaña et al. Sci Adv.

Abstract

The tendency for proteins to form aggregates is an inherent part of every proteome and arises from the self-assembly of short protein segments called aggregation-prone regions (APRs). While posttranslational modifications (PTMs) have been implicated in modulating protein aggregation, their direct role in APRs remains poorly understood. In this study, we used a combination of proteome-wide computational analyses and biophysical techniques to investigate the potential involvement of PTMs in aggregation regulation. Our findings reveal that while most PTM types are disfavored near APRs, N-glycosylation is enriched and evolutionarily selected, especially in proteins prone to misfolding. Experimentally, we show that N-glycosylation inhibits the aggregation of peptides in vitro through steric hindrance. Moreover, mining existing proteomics data, we find that the loss of N-glycans at the flanks of APRs leads to specific protein aggregation in Neuro2a cells. Our findings indicate that, among its many molecular functions, N-glycosylation directly prevents protein aggregation in higher eukaryotes.

Figures

Fig. 1. Relative enrichment of different PTM types in APRs and GRs.
(A) Flow chart illustrating the preparation of the dataset. (B) Bar plot showing the relative enrichment (odds ratio) of all PTM sites in APRs, GRs, and DRs relative to the background (all protein regions). (C) Heatmap showing the relative enrichment for each of the 17 types of PTMs. Columns indicate the different protein regions, while rows show the PTM types. Rows are clustered on the basis of Pearson correlation as a distance measure. The number of observations for each PTM type can be found in table S1. Statistical significance was determined by Fisher’s exact test with false discovery rate correction [(B) and (C)]. Crosses and asterisks indicate that a region has a significantly lower or higher frequency than the background, respectively. + and *P ≤ 0.05, ++ and **P ≤ 0.01, and +++ and ***P ≤ 0.001.
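
The enrichment statistics named in this caption (odds ratio, Fisher’s exact test, false discovery rate correction) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the 2x2 table layout (PTM sites versus other residues, inside a region versus outside it), the function names, and the choice of the Benjamini-Hochberg procedure for the FDR step are all assumptions made here.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]].

    Assumed layout: a = PTM sites in the region, b = other residues in the
    region, c = PTM sites elsewhere, d = other residues elsewhere.
    Raises ZeroDivisionError if b or c is zero.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that ties with the observed table count.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With one table per region and PTM type, the p-values from `fisher_exact_two_sided` would be collected into a list and passed through `benjamini_hochberg` before assigning the significance symbols.
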
Fig. 2. Functional assessment of N-glycosylation in APRs and GRs.
(A) Heatmap showing the relative enrichment of finding a sequon in each region (columns) for different groups of proteins (rows). Rows are clustered on the basis of Pearson correlation as a distance measure. The number of sequons observed in each group is indicated. (B) Ratio between the relative enrichments of glycosylated sequons versus nonglycosylated sequons. (C) Fraction of known SP proteins with at least one N-glycosite in EPs (yellow), with N-glycosites that are not in EPs (blue), or without N-glycosites (black). (D) Box plot showing the glycosylation efficiency of glycosylated sequons in EPs (yellow), rest of glycosylated sequons (blue), and nonglycosylated sequons (black) of human proteins. A.U., arbitrary units. (E) Box plot showing the conservation of human sequons in a set of 100 mammalian species for the same categories as (D). (F) Heatmap showing the relative enrichment of finding a sequon in each region (columns) for SP and non-SP proteins in five different eukaryotic species. For all species, the enrichment profiles of SP proteins are clustered together, while the same is true for non-SP proteins. Rows are clustered on the basis of Pearson correlation as a distance measure. The number of sequons observed in each group is indicated. Statistical significance was determined by Fisher’s exact test with false discovery rate correction [(A) and (F)] or by unpaired Wilcoxon test with Bonferroni correction for multiple comparisons [(D) and (E)]. Crosses and asterisks in the heatmaps indicate that a region has a significantly lower or higher frequency than the background, respectively. + and *P ≤ 0.05, ++ and **P ≤ 0.01, and +++ and ***P ≤ 0.001.
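
The caption does not define a sequon. The sketch below assumes the canonical N-glycosylation motif Asn-X-Ser/Thr, where X is any residue except proline, and shows one way to locate candidate sequons in a sequence. The function name and the example sequence are hypothetical.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return the 0-based positions of the Asn in every candidate sequon."""
    return [match.start() for match in SEQUON.finditer(sequence)]


# Hypothetical sequence: Asn at index 5 is followed by Pro and is skipped.
print(find_sequons("MNGSANPTNNTS"))  # [1, 8, 9]
```

A sequence match only marks a potential site. Whether a sequon is actually glycosylated has to come from experimental annotation, which is the distinction the caption draws between glycosylated and nonglycosylated sequons.
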
Fig. 3. N-glycosites at EPs behave as aggregation gatekeepers.
(A) Box plot showing the aggregation propensity (TANGO scores) of APRs that have glycosylated or nonglycosylated sequons for each EP. (B) Distribution of the number of unmodified gatekeeper residues flanking (three positions upstream and downstream) strong APRs that have glycosylated or nonglycosylated sequons for each EP. Strong APRs (TANGO score ≥ 50) were used to ensure that a high evolutionary pressure is acting on the APRs to mitigate their aggregation. (C) Subset of a multiple sequence alignment for the glycosylated site at position 439 of the BCAM protein. Glycosylated sequons and unmodified gatekeepers that are flanking or within the aligned APR are indicated. The positions of the human APR in the alignment are colored in gray. (D) Box plot showing the average number of unmodified gatekeepers flanking or within aligned APRs for all glycosylated and nonglycosylated human sequons at GR1 N-ter when these are conserved (orthologs with sequon) or not (orthologs without sequon) in 100 mammalian species. APRs are divided into two categories: weak if the TANGO score is <50 or strong if the TANGO score is ≥50. Red dots indicate the values for the BCAM site shown in (C). (E) Example of a serpin structure (alpha-1 antitrypsin; A1AT) obtained from AlphaFold. A1AT has an N-glycosite (orange) at the N-terminal flank of a very conserved APR (red). (F) Multiple sequence alignment showing the conserved APR (in gray) for intracellular and extracellular serpins. N-glycosylated sequons or unmodified gatekeepers, three residues upstream of the APR, are highlighted. Statistical significance was determined by unpaired Wilcoxon test with Bonferroni correction for multiple comparisons [(A) to (C)]. The number of glycosylated or nonglycosylated sequons in each region is indicated. *P ≤ 0.05, **P ≤ 0.01, and ***P ≤ 0.001. ns, not significant.
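
Counting gatekeeper residues in the three positions upstream and downstream of an APR, as in panel (B), can be sketched as follows. The gatekeeper set used here (Asp, Glu, Lys, Arg, and Pro) is an assumption, since the caption does not list it, and the function name, coordinates, and example sequence are hypothetical.

```python
# Assumed gatekeeper residues: charged residues plus proline.
GATEKEEPERS = frozenset("DEKRP")


def count_flanking_gatekeepers(sequence, apr_start, apr_end, flank=3):
    """Count gatekeeper residues flanking an APR.

    apr_start is the 0-based index of the first APR residue and apr_end is
    one past the last APR residue (Python slice convention). Flanks that
    run past either end of the sequence are simply truncated.
    """
    upstream = sequence[max(0, apr_start - flank):apr_start]
    downstream = sequence[apr_end:apr_end + flank]
    return sum(residue in GATEKEEPERS for residue in upstream + downstream)


# Hypothetical example: APR "LLYVVIA" at indices 4..10, flanked by
# "KDG" upstream (2 gatekeepers) and "PRE" downstream (3 gatekeepers).
print(count_flanking_gatekeepers("AKDGLLYVVIAPRE", 4, 11))  # 5
```
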
Fig. 4. In vitro analysis of N-glycosylated peptides.
(A) Schematic representation of the peptide variants and experimental design. An aggregation core is flanked by either a nonglycosylated Asn (WT), GlcNAc, or Man9. (B and C) ThT binding (B) and pFTAA binding (C) kinetics of the SLNYLLYVSN peptide set (n = 3). Vehicle control fluorescence is shown in purple. (D) ThT binding after incubation with 1 μl (500 U) of Endo H enzyme, which cleaves the bond between two GlcNAc subunits directly proximal to the asparagine residue of the glycopeptide (n = 3). Vehicle control fluorescence is shown in purple. (E) Percentage of the concentration of peptide in the soluble fraction after ultracentrifugation for the SLNYLLYVSN peptide set (n = 3). Unpaired t test was used to assess significance. (F) TEM images for the SLNYLLYVSN peptide set after q days of incubation. (G) Combined results for all peptide sets. Peptides were classified according to whether they showed or did not show kinetics based on ThT and pFTAA assays (marked with ticks or crosses, respectively) and whether they formed or did not form fibrillar aggregates detectable by TEM imaging (marked with ticks or crosses, respectively). (H) Percentage of soluble fraction for the charged residue variants [Asp (D), Glu (E), Lys (K), and Arg (R); n = 3]. Man9 values were reused from (E). Unpaired t test was used to assess significance against Man9. (I) Schematic representation of the structures of the different glycoforms analyzed. (J and K) Percentage of soluble fraction after ultracentrifugation for the nonglycosylated and glycoform versions of the NISCLWVFK (J) and SLNYLLYVSN (K) peptide sets (n = 3). Nonglycosylated and Man9 peptide values were reused from fig. S10 and from (E). n represents biological replicates. Bars represent means and error bars SD. *P ≤ 0.05, **P ≤ 0.01, and ***P ≤ 0.001.
Fig. 5. N-glycosylation protects against aggregation in hard-to-fold proteins.
(A) A random forest classifier was built to predict which APRs in CATH domains are protected (with an N-glycosite at an EP) or unprotected (all others). (B) Variable importance plot for the predictive model built using random undersampling. Higher values indicate that a variable is more important for the model. Domain-specific variables are highlighted in bold, while APR-specific variables are not highlighted. aa, amino acid. (C) Left: Schematic representation and percentage of domains classified in three categories: with N-glycosites in EPs (yellow), with N-glycosites not in EPs (blue), and without N-glycosites (black). APRs are colored in red. Right: A two-dimensional density plot showing the relative contact order and the maximum aggregation propensity for domains in each category. (D) Average relative contact order of domains in each CATH architecture. The dotted line indicates the average value for domains containing an N-glycosite at an EP. (E) Relative enrichment of finding a protected APR in each CATH architecture. The number of protected APRs present in each architecture is indicated. (F) Heatmap showing the position of N-glycosites at EPs in β-sandwich domains with at least one protected APR. Domains are sorted by length and colored in gray. (G) Box plot showing the number of APRs per 100 amino acids in β-sandwich domains. The number of domains in each category is indicated. (H) Box plot showing the number of β-sandwich domains in proteins with at least one of these domains. The number of proteins in each category is indicated. (I) Quality control system of glycoproteins. (J) Fraction of UGGT substrates that have at least one N-glycosite in an EP compared to the same fraction in all glycoproteins. Statistical significance was determined by unpaired Wilcoxon test with Bonferroni correction for multiple comparisons [(G) and (H)]. ***P ≤ 0.001.
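
Relative contact order, used in panels (C) and (D) as a measure of fold complexity, is commonly defined as the mean sequence separation between contacting residues divided by the chain length. The sketch below assumes that standard definition and takes a precomputed list of residue contacts as input. How contacts are derived from a structure (distance cutoff, atom selection) is not stated in the caption and is left out. The function name and example values are hypothetical.

```python
def relative_contact_order(contacts, length):
    """Relative contact order of a domain.

    contacts: iterable of (i, j) residue-index pairs that are in contact.
    length:   number of residues in the domain.
    Returns the mean |i - j| over all contacts, divided by length.
    """
    contacts = list(contacts)
    if not contacts or length <= 0:
        return 0.0
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * length)


# Hypothetical 10-residue domain with three contacts:
# separations 4, 8, and 1 give 13 / (3 * 10).
rco = relative_contact_order([(0, 4), (1, 9), (2, 3)], length=10)
```

Higher values indicate that more contacts are formed between residues far apart in sequence, which is generally associated with slower, more error-prone folding.
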
Fig. 6. Absence of N-glycosylation in Neuro2a cells specifically increases protein aggregation.
(A) A simplified overview of the process of glycan precursor synthesis. Tunicamycin (in red) completely blocks the enzymatic activity of UDP-N-acetylglucosamine—dolichyl-phosphate N-acetylglucosaminephosphotransferase (encoded by ALG7), eliminating the production of all glycan precursors and completely inhibiting N-glycosylation. (B to F) Percentage of proteins that are enriched in the insoluble or soluble fraction in each protein group after ER stress (B), treatment with a proteasome inhibitor (C), treatment with an HSP70 inhibitor (D), treatment with an HSP90 inhibitor (E), or oxidative stress (F) relative to the number of proteins identified by MS in each protein group (background). The total number of proteins (background) in each group and stress is indicated. (G) During translocation, the OST can glycosylate proteins before these are folded. When N-glycans are attached at the flanks of an APR, they shield this region from aggregation, leading to a glycosylated functional protein. However, the absence of N-glycosylation, specifically at the flanks of an APR, can lead to misfolding and aggregation of the affected proteins. The ultrastructure of these aggregates (amorphous or fibrillar) is not yet known.
