Prediction of protein-protein binding site by using core interface residue and support vector machine.

Authors

Li Nan, Sun Zhonghua, Jiang Fan

Affiliation

Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing, PR China.

Publication

BMC Bioinformatics. 2008 Dec 22;9:553. doi: 10.1186/1471-2105-9-553.

DOI:10.1186/1471-2105-9-553

PMID:19102736

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2627892/

Abstract

BACKGROUND

Predicting protein-protein binding sites can provide structural annotation for the protein interaction data produced by proteomics studies. This is important for the biological application of such data, whose volume is increasing rapidly. Moreover, methods for predicting protein interaction sites can provide crucial information for improving the speed and accuracy of protein docking methods.

RESULTS

In this work, we describe a binding site prediction method based on a new residue neighbour profile and on selecting only the core interface residues for support vector machine (SVM) training. The residue neighbour profile includes both the sequential and the spatial neighbours of an interface residue, giving a more complete description of the physical and chemical characteristics surrounding that residue. The concept of the core interface is applied in selecting the interface residues used to train the SVM models, which is shown to result in better discrimination between core interface residues and other residues. The best SVM model was tested on a test set of 50 randomly selected proteins. The sensitivity, specificity, and Matthews correlation coefficient (MCC) for the prediction of the core interface residues were 60.6%, 53.4%, and 0.243, respectively. On this test set, our method performed better than three other binding site prediction methods. Furthermore, our method was tested on the 101 unbound proteins from the protein-protein interaction benchmark v2.0. The sensitivity, specificity, and MCC in this test were 57.5%, 32.5%, and 0.168, respectively.
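As a reading aid, the three reported metrics can be computed from per-residue confusion-matrix counts as sketched below. The abstract does not define "specificity". With the definition TN/(TN+FP), the MCC has the same sign as sensitivity + specificity - 1, so the unbound-set values (57.5% and 32.5%) would imply a negative MCC. Since a positive MCC of 0.168 is reported, TP/(TP+FP) appears to be the more likely definition, but that is an inference rather than something the abstract states. The function name and the counts are hypothetical and are not data from the paper.

```python
import math


def interface_metrics(tp, fp, tn, fn):
    """Per-residue evaluation metrics from confusion-matrix counts.

    tp / fn: interface residues predicted as interface / non-interface.
    fp / tn: non-interface residues predicted as interface / non-interface.
    Assumes both classes and both predicted labels are non-empty.
    """
    sensitivity = tp / (tp + fn)          # fraction of interface residues recovered
    precision = tp / (tp + fp)            # ASSUMED meaning of "specificity" in the abstract
    true_negative_rate = tn / (tn + fp)   # the other common definition of specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "true_negative_rate": true_negative_rate,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = interface_metrics(tp=60, fp=50, tn=350, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Returning both candidate definitions side by side makes it easy to check which one is consistent with a set of published numbers.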

CONCLUSION

By improving both the description of the interface residues and their surrounding environment and the training strategy, we obtained better SVM models that outperformed previous methods. Our tests on unbound protein structures suggest that further improvement is possible.
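The idea of describing a residue by both its sequential and its spatial neighbours can be illustrated with a minimal sketch. This is not the authors' implementation. The function name, the single per-residue feature, the window size, the distance cutoff, and the zero-padding scheme are all assumptions made for illustration, since the abstract does not give those details.

```python
import math


def neighbour_profile(coords, features, i, seq_window=1, n_spatial=2, cutoff=6.0):
    """Fixed-length profile for residue i: sequence window plus nearest spatial neighbours.

    coords: one (x, y, z) point per residue; features: one float per residue.
    All parameter names and defaults are hypothetical.
    """
    n = len(coords)
    window = range(i - seq_window, i + seq_window + 1)
    # Sequential part: residue i and its chain neighbours, zero-padded at chain ends.
    sequential = [features[j] if 0 <= j < n else 0.0 for j in window]
    # Spatial part: residues outside the sequence window but within `cutoff`
    # of residue i, ordered from nearest to farthest.
    close = sorted(
        (math.dist(coords[i], coords[j]), j)
        for j in range(n)
        if j not in window and math.dist(coords[i], coords[j]) <= cutoff
    )
    spatial = [features[j] for _, j in close[:n_spatial]]
    spatial += [0.0] * (n_spatial - len(spatial))  # pad to a fixed length
    return sequential + spatial


# Toy chain of six residues that folds back on itself (made-up coordinates).
coords = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0), (7.6, 5, 0), (3.8, 5, 0)]
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(neighbour_profile(coords, features, i=1))
```

Vectors of this kind, labelled as core interface or other residues, could then be passed to an SVM implementation such as scikit-learn's `sklearn.svm.SVC`. The abstract does not say which SVM software was used.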
