

LMNglyPred: prediction of human N-linked glycosylation sites using embeddings from a pre-trained protein language model.

Affiliations

School of Computing, Wichita State University, 1845 Fairmount St., Wichita, KS 67260, USA.

Department of Computer Science and Engineering Technology, University of Houston-Downtown, Houston, TX 77002, USA.

Publication information

Glycobiology. 2023 Jun 3;33(5):411-422. doi: 10.1093/glycob/cwad033.

DOI: 10.1093/glycob/cwad033
PMID: 37067908
Abstract

Protein N-linked glycosylation is an important post-translational mechanism in Homo sapiens that plays essential roles in many vital biological processes. It occurs at the N-X-[S/T] sequon in amino acid sequences, where X can be any amino acid except proline. However, not all N-X-[S/T] sequons are glycosylated; the sequon is therefore a necessary but not sufficient determinant of protein glycosylation. Computational prediction of N-linked glycosylation sites confined to N-X-[S/T] sequons is thus an important problem that existing methods have not extensively addressed, especially with regard to the creation of negative sets and the use of distilled information from protein language models (pLMs). Here, we developed LMNglyPred, a deep learning-based approach that predicts N-linked glycosylation sites in human proteins using embeddings from a pre-trained pLM. On a benchmark-independent test set, LMNglyPred achieves a sensitivity of 76.50%, a specificity of 75.36%, a precision of 60.99%, an accuracy of 75.74%, and a Matthews correlation coefficient of 0.49. These results demonstrate that LMNglyPred is a robust computational tool for predicting N-linked glycosylation sites confined to the N-X-[S/T] sequon.
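The sequon rule and the reported metrics can be illustrated with a short Python sketch. This is not the LMNglyPred code, and it does not involve a language model. It only shows how candidate asparagines matching N-X-[S/T] (X not proline) might be enumerated, and how the five metrics named in the abstract follow from confusion-matrix counts. The function names `find_sequons` and `binary_metrics` and the example sequence and counts are hypothetical and chosen for illustration.

```python
import math
import re

# N-X-[S/T] with X != proline. The lookahead keeps the match to the Asn
# alone, so overlapping sequons (e.g. "NNSS") are each reported.
# Assumption: the input uses the standard one-letter amino acid alphabet.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-[S/T] sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute sensitivity, specificity, precision, accuracy and MCC.

    Ratios with a zero denominator are returned as 0.0 (a convention
    assumed here for simplicity).
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        # MCC ranges from -1 to 1 and is not a percentage.
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Made-up sequence: "NPT" is skipped because X is proline.
    print(find_sequons("MNGSANPTQNATNNSS"))
    # Made-up counts, unrelated to the paper's test set.
    print(binary_metrics(tp=6, fp=3, tn=9, fn=2))
```

Every position returned by `find_sequons` is only a candidate. Because the sequon is necessary but not sufficient, a predictor still has to classify each candidate as glycosylated or not.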

