In silico platform for prediction of N-, O- and C-glycosites in eukaryotic protein sequences.

Affiliation

Bioinformatic Centre, Institute of Microbial Technology, Chandigarh, India.

Publication information

PLoS One. 2013 Jun 28;8(6):e67008. doi: 10.1371/journal.pone.0067008. Print 2013.

DOI:10.1371/journal.pone.0067008
PMID:23840574
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3695939/
Abstract

Glycosylation is one of the most abundant and important post-translational modifications of proteins. Glycosylated proteins (glycoproteins) are involved in various cellular biological functions such as protein folding, cell-cell interactions, cell recognition and host-pathogen interactions. A large number of eukaryotic glycoproteins also have therapeutic and potential technological applications. Therefore, characterization and analysis of glycosites (glycosylated residues) in these proteins is of great interest to biologists. To cater to these needs, a number of in silico tools have been developed over the years; however, a need for even better prediction tools remains. Therefore, in this study we have developed a new web server, GlycoEP, for more accurate prediction of N-linked, O-linked and C-linked glycosites in eukaryotic glycoproteins using two larger datasets, namely the standard and advanced datasets. In the standard datasets, no two glycosylated proteins are more than 40% similar; the advanced datasets are highly non-redundant, with no two glycosite patterns (as defined in the methods) having more than 60% similarity. Further, based on our results with several algorithms developed using different machine-learning techniques, we found the Support Vector Machine (SVM) to be the optimum tool for developing glycosite prediction models. Accordingly, using our more stringent and non-redundant advanced datasets, the SVM-based models developed in this study achieved prediction accuracies of 84.26%, 86.87% and 91.43%, with corresponding MCC values of 0.54, 0.20 and 0.78, for N-, O- and C-linked glycosites, respectively. The best-performing models trained on the advanced datasets were then implemented as a user-friendly web server, GlycoEP (http://www.imtech.res.in/raghava/glycoep/). Additionally, this server provides prediction models developed on the standard datasets and allows users to scan sequons in input protein sequences.
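
The abstract mentions scanning sequons in input protein sequences. As a rough illustration of what such a scan can look like, the sketch below searches for the canonical N-glycosylation sequon N-X-[S/T], where X is any residue except proline. This is a minimal sketch and not GlycoEP's implementation: the function name `find_n_sequons` and the toy sequence are hypothetical, and a sequon match only marks a candidate site, not a confirmed glycosite.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# sequons (e.g. in "NNST") are each reported.
N_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_sequons(sequence):
    """Return (1-based Asn position, tripeptide) for each N-X-[S/T] match."""
    seq = sequence.upper()
    return [
        (m.start() + 1, seq[m.start():m.start() + 3])
        for m in N_SEQUON.finditer(seq)
    ]


if __name__ == "__main__":
    demo = "MKNGSAANPTLLNNST"  # made-up toy sequence, not a real protein
    for position, tripeptide in find_n_sequons(demo):
        print(position, tripeptide)
```

In the toy sequence, the Asn followed by proline ("NPT") is skipped, while the two overlapping candidates near the C-terminus are both returned. A predictor such as the SVM models described above would then score each candidate rather than accept every sequon.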

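
The abstract reports accuracy alongside the Matthews correlation coefficient (MCC); the O-linked model, for instance, pairs 86.87% accuracy with an MCC of 0.20. One plausible reason for such a gap is class imbalance, since glycosylated residues are usually far fewer than non-glycosylated ones. The sketch below computes both metrics from a confusion matrix. The counts are invented for illustration and are not taken from the paper, and the function names are hypothetical.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Invented, imbalanced counts (25 positives vs 975 negatives); not from the paper.
    tp, fn, fp, tn = 5, 20, 10, 965
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

With these made-up counts the classifier is right on most residues simply because negatives dominate, yet it recovers few true sites, so the MCC stays low. This is why the two numbers are best read together when comparing the N-, O- and C-linked models.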

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2687/3695939/0bfdbadb9612/pone.0067008.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2687/3695939/986af7aeb5a3/pone.0067008.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2687/3695939/92f47bf6f5e9/pone.0067008.g002.jpg

Similar articles

1
In silico platform for prediction of N-, O- and C-glycosites in eukaryotic protein sequences.
PLoS One. 2013 Jun 28;8(6):e67008. doi: 10.1371/journal.pone.0067008. Print 2013.
2
GlycoPP: a webserver for prediction of N- and O-glycosites in prokaryotic protein sequences.
PLoS One. 2012;7(7):e40155. doi: 10.1371/journal.pone.0040155. Epub 2012 Jul 9.
3
N-GlyDE: a two-stage N-linked glycosylation site prediction incorporating gapped dipeptides and pattern-based encoding.
Sci Rep. 2019 Nov 4;9(1):15975. doi: 10.1038/s41598-019-52341-z.
4
O-GlyThr: Prediction of human O-linked threonine glycosites using multi-feature fusion.
Int J Biol Macromol. 2023 Jul 1;242(Pt 2):124761. doi: 10.1016/j.ijbiomac.2023.124761. Epub 2023 May 6.
5
Glycosylation site prediction using ensembles of Support Vector Machine classifiers.
BMC Bioinformatics. 2007 Nov 9;8:438. doi: 10.1186/1471-2105-8-438.
6
ProGlycProt: a repository of experimentally characterized prokaryotic glycoproteins.
Nucleic Acids Res. 2012 Jan;40(Database issue):D388-93. doi: 10.1093/nar/gkr911. Epub 2011 Oct 28.
7
Prediction of mucin-type O-glycosylation sites in mammalian proteins using the composition of k-spaced amino acid pairs.
BMC Bioinformatics. 2008 Feb 18;9:101. doi: 10.1186/1471-2105-9-101.
8
GlycoMine: a machine learning-based approach for predicting N-, C- and O-linked glycosylation in the human proteome.
Bioinformatics. 2015 May 1;31(9):1411-9. doi: 10.1093/bioinformatics/btu852. Epub 2015 Jan 6.
9
Prediction of N-linked glycosylation sites using position relative features and statistical moments.
PLoS One. 2017 Aug 10;12(8):e0181966. doi: 10.1371/journal.pone.0181966. eCollection 2017.
10
ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins.
BMC Bioinformatics. 2008 Nov 28;9:503. doi: 10.1186/1471-2105-9-503.

Cited by

1
StackGlyEmbed: prediction of N-linked glycosylation sites using protein language models.
Bioinform Adv. 2025 Jun 28;5(1):vbaf146. doi: 10.1093/bioadv/vbaf146. eCollection 2025.
2
Complex transcription regulation of acidic chitinase suggests fine-tuning of digestive processes in Drosera binata.
Planta. 2025 Jan 12;261(2):32. doi: 10.1007/s00425-025-04607-2.
3
The Historical Evolution and Significance of Multiple Sequence Alignment in Molecular Structure and Function Prediction.
Biomolecules. 2024 Nov 29;14(12):1531. doi: 10.3390/biom14121531.
4
Immunogenic recombinant Mayaro virus-like particles present natively assembled glycoprotein.
NPJ Vaccines. 2024 Dec 17;9(1):243. doi: 10.1038/s41541-024-01021-9.
5
Plasmodium vivax antigen candidate prediction improves with the addition of Plasmodium falciparum data.
NPJ Syst Biol Appl. 2024 Nov 13;10(1):133. doi: 10.1038/s41540-024-00465-y.
6
Chromosome level assemblies of Nakaseomyces (Candida) bracarensis uncover two distinct clades and define its adhesin repertoire.
BMC Genomics. 2024 Nov 7;25(1):1053. doi: 10.1186/s12864-024-10979-8.
7
Boltzmann Model Predicts Glycan Structures from Lectin Binding.
Anal Chem. 2024 May 28;96(21):8332-8341. doi: 10.1021/acs.analchem.3c04992. Epub 2024 May 8.
8
Positive-unlabeled learning identifies vaccine candidate antigens in the malaria parasite Plasmodium falciparum.
NPJ Syst Biol Appl. 2024 Apr 27;10(1):44. doi: 10.1038/s41540-024-00365-1.
9
DOTAD: A Database of Therapeutic Antibody Developability.
Interdiscip Sci. 2024 Sep;16(3):623-634. doi: 10.1007/s12539-024-00613-2. Epub 2024 Mar 26.
10
Structural analysis and functional evaluation of the disordered β-hexosyltransferase region from .
Front Bioeng Biotechnol. 2023 Dec 14;11:1291245. doi: 10.3389/fbioe.2023.1291245. eCollection 2023.

References

1
GlycoPP: a webserver for prediction of N- and O-glycosites in prokaryotic protein sequences.
PLoS One. 2012;7(7):e40155. doi: 10.1371/journal.pone.0040155. Epub 2012 Jul 9.
2
Prediction of lysine post-translational modifications using bioinformatic tools.
Essays Biochem. 2012;52:165-77. doi: 10.1042/bse0520165.
3
ProGlycProt: a repository of experimentally characterized prokaryotic glycoproteins.
Nucleic Acids Res. 2012 Jan;40(Database issue):D388-93. doi: 10.1093/nar/gkr911. Epub 2011 Oct 28.
4
Similarities and differences in the glycosylation mechanisms in prokaryotes and eukaryotes.
Int J Microbiol. 2010;2010:148178. doi: 10.1155/2010/148178. Epub 2011 Jan 27.
5
Prediction of GTP interacting residues, dipeptides and tripeptides in a protein from its evolutionary information.
BMC Bioinformatics. 2010 Jun 3;11:301. doi: 10.1186/1471-2105-11-301.
6
N-Linked glycoengineering for human therapeutic proteins in bacteria.
Biotechnol Lett. 2010 Sep;32(9):1189-98. doi: 10.1007/s10529-010-0289-6. Epub 2010 May 7.
7
Prediction of glycosylation sites using random forests.
BMC Bioinformatics. 2008 Nov 27;9:500. doi: 10.1186/1471-2105-9-500.
8
Predicting protein post-translational modifications using meta-analysis of proteome scale data sets.
Mol Cell Proteomics. 2009 Feb;8(2):365-79. doi: 10.1074/mcp.M800332-MCP200. Epub 2008 Oct 28.
9
Glycosylation site prediction using ensembles of Support Vector Machine classifiers.
BMC Bioinformatics. 2007 Nov 9;8:438. doi: 10.1186/1471-2105-8-438.
10
Prediction of RNA binding sites in a protein using SVM and PSSM profile.
Proteins. 2008 Apr;71(1):189-94. doi: 10.1002/prot.21677.