Prediction of GTP interacting residues, dipeptides and tripeptides in a protein from its evolutionary information.

Affiliation

Bioinformatics Centre, Institute of Microbial Technology (IMTECH), Chandigarh, India.

Publication information

BMC Bioinformatics. 2010 Jun 3;11:301. doi: 10.1186/1471-2105-11-301.

Abstract

BACKGROUND

Guanosine triphosphate (GTP)-binding proteins play an important role in G-protein regulation. Predicting the GTP-interacting residues in a protein is therefore one of the major challenges in computational biology. In this study, we attempt to develop a computational method that predicts GTP-interacting residues in a protein with high accuracy (Acc), precision (Prec) and recall (Rc).

RESULTS

All models developed in this study were trained and tested on a non-redundant (40% similarity) dataset using five-fold cross-validation. First, we developed neural network-based models using the single sequence and the position-specific scoring matrix (PSSM) profile, which achieved maximum Matthews correlation coefficients (MCC) of 0.24 (Acc 61.30%) and 0.39 (Acc 68.88%), respectively. Second, we developed support vector machine (SVM)-based models using the single sequence and the PSSM profile, which achieved maximum MCC values of 0.37 (Prec 0.73, Rc 0.57, Acc 67.98%) and 0.55 (Prec 0.80, Rc 0.73, Acc 77.17%), respectively. This work also introduces, for the first time, the prediction of GTP-interacting dipeptides (two consecutive GTP-interacting residues) and tripeptides (three consecutive GTP-interacting residues). An SVM-based model for predicting GTP-interacting dipeptides from the PSSM profile achieved an MCC of 0.64 with precision 0.87, recall 0.74 and accuracy 81.37%. Similarly, an SVM-based model for predicting GTP-interacting tripeptides from the PSSM profile achieved an MCC of 0.70 with precision 0.93, recall 0.73 and accuracy 83.98%.
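The four scores reported above (precision, recall, accuracy and MCC) are all functions of the same binary confusion matrix. The following minimal sketch shows the standard definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute precision, recall, accuracy and MCC from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives for a binary classifier
    (here: residue predicted as GTP-interacting or not).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # MCC denominator is zero when any row or column of the matrix is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts chosen for easy arithmetic, not results from the study.
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(example)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is presumably why it is reported alongside accuracy here.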

CONCLUSION

These results show that the PSSM-based method performs better than the single-sequence-based method. The prediction models based on dipeptides or tripeptides are more accurate than the traditional model based on single residues. A web server, "GTPBinder" (http://www.imtech.res.in/raghava/gtpbinder/), based on the above models has been developed for predicting GTP-interacting residues in a protein.
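The abstract defines an interacting dipeptide or tripeptide as two or three consecutive GTP-interacting residues. One plausible way to derive such labels from per-residue annotations is a sliding window, sketched below. The function name `label_kmers`, the example labels and the treatment of overlapping windows are assumptions for illustration; the abstract does not describe how the authors built their dipeptide and tripeptide datasets.

```python
def label_kmers(residue_labels, k):
    """Label every window of k consecutive residues.

    residue_labels: list of 0/1 flags, 1 meaning the residue interacts with GTP.
    Returns one 0/1 flag per window; a window is positive only if all k
    residues in it are interacting (assumed reading of the abstract's definition).
    """
    return [
        int(all(residue_labels[i:i + k]))
        for i in range(len(residue_labels) - k + 1)
    ]


# Hypothetical per-residue annotation for a seven-residue stretch.
labels = [0, 1, 1, 1, 0, 1, 0]
dipeptides = label_kmers(labels, 2)   # windows of two residues
tripeptides = label_kmers(labels, 3)  # windows of three residues
print(dipeptides)
print(tripeptides)
```

In this toy example the isolated interacting residue at position 6 produces no positive dipeptide or tripeptide, so the positive windows are restricted to contiguous runs of interacting residues.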
