Identification of interface residues in protease-inhibitor and antigen-antibody complexes: a support vector machine approach.

Authors

Yan Changhui, Honavar Vasant, Dobbs Drena

Affiliation

Artificial Intelligence Research Laboratory, Iowa State University, Atanasoff Hall 226, Ames, IA 50011-1040, USA.

Publication

Neural Comput Appl. 2004 Jun 1;13(2):123-129. doi: 10.1007/s00521-004-0414-3.

DOI:10.1007/s00521-004-0414-3

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2880521/

Abstract

In this paper, we describe a machine learning approach for sequence-based prediction of protein-protein interaction sites. A support vector machine (SVM) classifier was trained to predict whether or not a surface residue is an interface residue (i.e., is located in the protein-protein interaction surface), based on the identity of the target residue and its ten sequence neighbors. Separate classifiers were trained on proteins from two categories of complexes, antibody-antigen and protease-inhibitor. The effectiveness of each classifier was evaluated using leave-one-out (jack-knife) cross-validation. Interface and non-interface residues were classified with relatively high sensitivity (82.3% and 78.5%) and specificity (81.0% and 77.6%) for proteins in the antigen-antibody and protease-inhibitor complexes, respectively. The correlation between predicted and actual labels was 0.430 and 0.462, indicating that the method performs substantially better than chance (zero correlation). Combined with recently developed methods for identification of surface residues from sequence information, this offers a promising approach to predict residues involved in protein-protein interactions from sequence information alone.
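
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several details are assumptions: that the ten sequence neighbors are split five on each side of the target residue, that each window position is one-hot encoded over the 20 standard amino acids, that sensitivity and specificity follow the common textbook definitions (the paper may define them differently), and that the reported correlation is the Matthews-style correlation between binary labels. The names `encode_window`, `leave_one_out`, and `evaluate` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def encode_window(sequence, center, flank=5):
    """One-hot encode the residue at `center` and `flank` neighbors per side.

    flank=5 gives the target plus ten neighbors; the even split is an assumption.
    Positions past either end of the chain, or unknown letters, stay all-zero.
    """
    features = []
    for pos in range(center - flank, center + flank + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                one_hot[idx] = 1
        features.extend(one_hot)
    return features


def leave_one_out(items):
    """Yield (training_items, held_out_item) pairs, one per item."""
    for i, held_out in enumerate(items):
        yield items[:i] + items[i + 1:], held_out


def evaluate(actual, predicted):
    """Return (sensitivity, specificity, correlation) for 0/1 labels.

    Uses textbook definitions, which may differ from those in the paper.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    correlation = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, correlation
```

In such a setup, each surface residue would contribute one feature vector from `encode_window`, an SVM from any standard library would be trained on all but the held-out item produced by `leave_one_out`, and `evaluate` would score the pooled predictions. A correlation of zero corresponds to chance-level prediction, which is the baseline the abstract compares against.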

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81e8/2880521/e7a18c6d6eb3/nihms67534f1.jpg

Similar articles

1

Identification of interface residues in protease-inhibitor and antigen-antibody complexes: a support vector machine approach.

Neural Comput Appl. 2004 Jun 1;13(2):123-129. doi: 10.1007/s00521-004-0414-3.

2

Predicting DNA-binding sites of proteins from amino acid sequence.

BMC Bioinformatics. 2006 May 19;7:262. doi: 10.1186/1471-2105-7-262.

3

Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art.

BMC Bioinformatics. 2012 May 10;13:89. doi: 10.1186/1471-2105-13-89.

4

Using support vector machine combined with post-processing procedure to improve prediction of interface residues in transient complexes.

Protein J. 2009 Oct;28(7-8):369-74. doi: 10.1007/s10930-009-9203-2.

5

Development of a machine learning method to predict membrane protein-ligand binding residues using basic sequence information.

Adv Bioinformatics. 2015;2015:843030. doi: 10.1155/2015/843030. Epub 2015 Jan 31.

6

Prediction of protein-protein binding site by using core interface residue and support vector machine.

BMC Bioinformatics. 2008 Dec 22;9:553. doi: 10.1186/1471-2105-9-553.

7

Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information.

BMC Bioinformatics. 2010 Jul 28;11:402. doi: 10.1186/1471-2105-11-402.

8

Techniques for Developing Reliable Machine Learning Classifiers Applied to Understanding and Predicting Protein:Protein Interaction Hot Spots.

Methods Mol Biol. 2024;2714:235-268. doi: 10.1007/978-1-0716-3441-7_14.

9

A two-stage classifier for identification of protein-protein interface residues.

Bioinformatics. 2004 Aug 4;20 Suppl 1:i371-8. doi: 10.1093/bioinformatics/bth920.

10

Glycosylation site prediction using ensembles of Support Vector Machine classifiers.

BMC Bioinformatics. 2007 Nov 9;8:438. doi: 10.1186/1471-2105-8-438.

Cited by

1

jEcho: an Evolved weight vector to CHaracterize the protein's posttranslational modification mOtifs.

Interdiscip Sci. 2015 Jun;7(2):194-9. doi: 10.1007/s12539-015-0260-2. Epub 2015 Aug 6.

2

A local average distance descriptor for flexible protein structure comparison.

BMC Bioinformatics. 2014 Apr 2;15:95. doi: 10.1186/1471-2105-15-95.

3

Evaluation of features for catalytic residue prediction in novel folds.

Protein Sci. 2007 Feb;16(2):216-26. doi: 10.1110/ps.062523907. Epub 2006 Dec 22.

4

Prediction of RNA binding sites in proteins from amino acid sequence.

RNA. 2006 Aug;12(8):1450-62. doi: 10.1261/rna.2197306. Epub 2006 Jun 21.

5

Predicting DNA-binding sites of proteins from amino acid sequence.

BMC Bioinformatics. 2006 May 19;7:262. doi: 10.1186/1471-2105-7-262.

6

Predicting binding sites of hydrolase-inhibitor complexes by combining several methods.

BMC Bioinformatics. 2004 Dec 17;5:205. doi: 10.1186/1471-2105-5-205.
