
iDPGK: characterization and identification of lysine phosphoglycerylation sites based on sequence-based features.

Affiliations

Department of Medical Research, Hsinchu Mackay Memorial Hospital, Hsinchu City 300, Taiwan.

Department of Medicine, Mackay Medical College, New Taipei City 252, Taiwan.

Publication information

BMC Bioinformatics. 2020 Dec 9;21(1):568. doi: 10.1186/s12859-020-03916-5.

DOI:10.1186/s12859-020-03916-5
PMID:33297954
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7727188/
Abstract

BACKGROUND

Protein phosphoglycerylation, the addition of 1,3-bisphosphoglyceric acid (1,3-BPG) to a lysine residue of a protein to form 3-phosphoglyceryl-lysine, is a reversible, non-enzymatic post-translational modification (PTM) that plays a regulatory role in glucose metabolism and glycolysis. As the number of experimentally verified phosphoglycerylated sites has increased significantly, statistical and machine learning methods are needed to investigate the characteristics of these sites. Research into phosphoglycerylation is currently very limited, and only a few resources are available for the computational identification of phosphoglycerylation sites.

RESULTS

We present a bioinformatics investigation of phosphoglycerylation sites based on sequence-based features. The TwoSampleLogo analysis reveals that the regions surrounding phosphoglycerylation sites contain a relatively high abundance of positively charged amino acids, especially in the upstream flanking region. In addition, the PTM-Logo results show that non-polar and aliphatic amino acids are more abundant around phosphoglycerylated lysines, which may help discriminate between phosphoglycerylation and non-phosphoglycerylation sites. Several types of features were used to build the prediction model on the training dataset: amino acid composition, amino acid pair composition, positional weighted matrix, and position-specific scoring matrix. To improve predictive power, the top features ranked by F-score were combined as the final feature set for classification, and predictive models were trained using decision tree (DT), random forest (RF), and support vector machine (SVM) classifiers. Evaluation by five-fold cross-validation showed that the selected features were the most effective in discriminating between phosphoglycerylated and non-phosphoglycerylated sites.
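To make the feature pipeline described above more concrete, the following is a minimal sketch of two of its steps: encoding a lysine-centred sequence window as amino acid composition, and ranking the resulting features by F-score. It is not the authors' implementation. The abstract does not define the F-score, so the sketch assumes the commonly used formulation (between-class separation of the feature means divided by the sum of within-class sample variances). The window length, the toy sequences, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(window):
    """Amino acid composition: fraction of each standard residue in a window."""
    counts = Counter(window)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def _mean(values):
    return sum(values) / len(values)


def _sample_var(values, mean):
    # Unbiased sample variance; needs at least two values per class.
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def f_scores(pos, neg):
    """F-score of each feature column (assumed formulation, see lead-in).

    pos and neg are lists of equal-length feature vectors for the
    positive and negative class respectively.
    """
    scores = []
    for i in range(len(pos[0])):
        p = [vec[i] for vec in pos]
        n = [vec[i] for vec in neg]
        mean_p, mean_n, mean_all = _mean(p), _mean(n), _mean(p + n)
        between = (mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2
        within = _sample_var(p, mean_p) + _sample_var(n, mean_n)
        # Columns with no within-class variance are scored 0 in this sketch.
        scores.append(between / within if within > 0 else 0.0)
    return scores


# Made-up 9-residue windows with the candidate lysine (K) at the centre.
positives = ["RKAKKLGAV", "KRLKKAVLA", "RRAKKGLIV"]
negatives = ["DESEKDTSQ", "EDTSKQNDE", "SDEEKTNQS"]

pos_vecs = [aac(w) for w in positives]
neg_vecs = [aac(w) for w in negatives]
scores = f_scores(pos_vecs, neg_vecs)

# Rank residues by F-score and keep the highest-scoring ones as features.
ranked = sorted(zip(AMINO_ACIDS, scores), key=lambda t: t[1], reverse=True)
top_features = [residue for residue, _ in ranked[:5]]
print(top_features)
```

In a fuller pipeline the same ranking would be applied to the concatenation of all feature types, and the retained columns would be passed to a classifier evaluated by cross-validation.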

CONCLUSION

The SVM model trained with the selected sequence-based features performed well, with a sensitivity of 77.5%, a specificity of 73.6%, an accuracy of 74.9%, and a Matthews correlation coefficient (MCC) of 0.49. The model also performed consistently on the independent testing set, yielding a sensitivity of 75.7% and a specificity of 64.9%. Finally, the model has been implemented as a web-based system, iDPGK, which is freely available at http://mer.hc.mmh.org.tw/iDPGK/.
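For readers less familiar with the four measures reported above, the sketch below computes them from a binary confusion matrix using their standard definitions. The counts are made up for illustration and are not taken from the paper's dataset; the function name is likewise hypothetical.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true sites recovered
    specificity = tn / (tn + fp)          # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return sensitivity, specificity, accuracy, mcc


# Hypothetical counts: 40 positive and 80 negative sites (illustration only).
sens, spec, acc, mcc = binary_metrics(tp=31, fn=9, tn=59, fp=21)
print(f"Sn={sens:.3f} Sp={spec:.3f} Acc={acc:.3f} MCC={mcc:.3f}")
```

Because accuracy is dominated by the larger class when positives and negatives are unbalanced, MCC is a useful companion measure: it only approaches 1 when both classes are predicted well.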



