Prediction of GTP interacting residues, dipeptides and tripeptides in a protein from its evolutionary information.

Affiliation

Bioinformatics Centre, Institute of Microbial Technology (IMTECH), Chandigarh, India.

Publication information

BMC Bioinformatics. 2010 Jun 3;11:301. doi: 10.1186/1471-2105-11-301.

PMID: 20525281

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3098072/

Abstract

BACKGROUND

Guanosine triphosphate (GTP)-binding proteins play an important role in the regulation of G-proteins. Predicting the GTP interacting residues in a protein is therefore one of the major challenges in computational biology. In this study, we attempt to develop a computational method that predicts GTP interacting residues in a protein with high accuracy (Acc), precision (Prec) and recall (Rc).
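As a reading aid, the sketch below shows how accuracy, precision and recall, together with the Matthews Correlation Coefficient (MCC) used in the results, are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: interacting residues predicted correctly / missed.
    tn/fp: non-interacting residues predicted correctly / wrongly flagged.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so degenerate inputs return 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": accuracy, "prec": precision, "rc": recall, "mcc": mcc}


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(example)
```

MCC uses all four cells of the confusion matrix, so it remains informative even when interacting and non-interacting residues are unevenly represented.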

RESULT

All models developed in this study were trained and tested on a non-redundant dataset (40% similarity) using five-fold cross-validation. First, we developed neural network based models using single sequence and position-specific scoring matrix (PSSM) profile, which achieved maximum Matthews Correlation Coefficient (MCC) of 0.24 (Acc 61.30%) and 0.39 (Acc 68.88%) respectively. Second, we developed support vector machine (SVM) based models using single sequence and PSSM profile, which achieved maximum MCC of 0.37 (Prec 0.73, Rc 0.57, Acc 67.98%) and 0.55 (Prec 0.80, Rc 0.73, Acc 77.17%) respectively. This work also introduces, for the first time, the concept of predicting GTP interacting dipeptides (two consecutive GTP interacting residues) and tripeptides (three consecutive GTP interacting residues). An SVM based model for predicting GTP interacting dipeptides using PSSM profile achieved MCC 0.64 with precision 0.87, recall 0.74 and accuracy 81.37%. Similarly, an SVM based model for predicting GTP interacting tripeptides using PSSM profile achieved MCC 0.70 with precision 0.93, recall 0.73 and accuracy 83.98%.
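The dipeptide and tripeptide definitions given here (two or three consecutive GTP interacting residues) can be illustrated with a small sketch. The abstract does not say how the authors built their patterns, so the function `interacting_kmers` and the label vector are hypothetical and only show one way to locate such runs in per-residue binary labels.

```python
def interacting_kmers(labels, k):
    """Return start positions of every window of k consecutive interacting residues.

    labels: per-residue sequence of 0/1 flags (1 = GTP interacting); an assumed
            encoding, not necessarily the one used in the paper.
    k:      window length (2 for dipeptides, 3 for tripeptides).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    return [
        start
        for start in range(len(labels) - k + 1)
        if all(labels[start:start + k])
    ]


# Hypothetical labels for a nine-residue stretch.
labels = [0, 1, 1, 1, 0, 0, 1, 1, 0]
dipeptides = interacting_kmers(labels, 2)   # windows of two interacting residues
tripeptides = interacting_kmers(labels, 3)  # windows of three interacting residues
print(dipeptides, tripeptides)
```

Overlapping windows are reported separately, so a run of three interacting residues yields two dipeptide positions and one tripeptide position.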

CONCLUSION

These results show that the PSSM based method performs better than the single sequence based method. The prediction models based on dipeptides or tripeptides are more accurate than the traditional model based on single residues. A web server, "GTPBinder" (http://www.imtech.res.in/raghava/gtpbinder/), based on the above models has been developed for predicting GTP interacting residues in a protein.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd1e/3098072/2861e8a615cb/1471-2105-11-301-1.jpg
