Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines.

Affiliation

Department of Computer and Information Sciences, University of Delaware 421 Smith Hall, Newark, DE 19716, USA.

Publication information

BMC Bioinformatics. 2010 Oct 29;11:537. doi: 10.1186/1471-2105-11-537.

DOI:10.1186/1471-2105-11-537

PMID:21034480

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2989984/

Abstract

BACKGROUND

Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles.

RESULTS

In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMMs), in which interacting residues in domains are explicitly modeled according to the three-dimensional structural information available at the Protein Data Bank (PDB). Features of the domains are first extracted as the Fisher scores derived from the ipHMMs and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors and are classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross-validation experiments with a set of interacting protein pairs adopted from the 3DID database. Prediction accuracy improves significantly over InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure.
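
The feature-reduction and pairing steps described above can be sketched as follows. This is a minimal illustration in Python, not the authors' Matlab code, and it rests on several assumptions. Each domain is taken to already have a Fisher score vector (the gradient of the ipHMM log-likelihood with respect to the model parameters; computing it is not shown). SVD is used here to project those vectors onto the top-k right singular vectors, which is one common way to reduce dimensionality and may differ from the paper's exact selection rule. The function names and the toy numbers are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_right_singular_vectors(X, k, iters=500):
    """Approximate the top-k right singular vectors of X (one row per domain).

    Uses power iteration on the Gram matrix X^T X, deflating after each
    component. Assumes k does not exceed the rank of X.
    """
    d = len(X[0])
    # The eigenvectors of X^T X are the right singular vectors of X.
    G = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    components = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]  # fixed start vector, repeatable runs
        for _ in range(iters):
            w = [_dot(g_row, v) for g_row in G]
            norm = math.sqrt(_dot(w, w))
            v = [x / norm for x in w]
        eigval = _dot(v, [_dot(g_row, v) for g_row in G])
        components.append(v)
        # Deflate: remove the direction just found so the next pass finds a new one.
        G = [[G[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
    return components


def project(x, components):
    """Reduced feature vector: coordinates of x along each retained component."""
    return [_dot(x, c) for c in components]


def pair_features(a, b, components):
    """Represent a domain pair by concatenating the two reduced vectors."""
    return project(a, components) + project(b, components)


if __name__ == "__main__":
    # Hypothetical Fisher score vectors for four domains (3 raw features each).
    fisher = [
        [2.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [2.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
    ]
    comps = top_right_singular_vectors(fisher, k=2)
    print(pair_features(fisher[0], fisher[1], comps))
```

In this toy input the third raw feature carries no variation, so keeping two components discards it; the concatenated pair vector then has length 2k. Note that concatenation is order-dependent, so the pair (A, B) and the pair (B, A) yield different vectors unless the ordering is fixed by convention.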

CONCLUSIONS

We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. The software is implemented in Matlab and is supported on Linux and MS Windows.
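
The supervised-learning and evaluation steps (an SVM trained on pair vectors and tested by leave-one-out cross-validation) can be illustrated with a small self-contained sketch. It is written in Python rather than the authors' Matlab, and everything specific in it is an assumption: the linear kernel, the omission of a bias term, the sub-gradient training loop, the hyperparameter values, and the toy pair vectors and labels (+1 for interacting, -1 for non-interacting). The abstract does not state the kernel or how negative examples were chosen.

```python
def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Fit a linear SVM (no bias term) by sub-gradient descent on the
    L2-regularised hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink step from the L2 regulariser.
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Hinge-loss step only when the example is inside the margin
            # or on the wrong side of the boundary.
            if margin < 1.0:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, x):
    """Return +1 (interacting) or -1 (non-interacting)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1


def leave_one_out_accuracy(X, y, **svm_kwargs):
    """Hold out each example once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        w = train_linear_svm(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **svm_kwargs)
        if predict(w, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


if __name__ == "__main__":
    # Hypothetical 2-D pair vectors; real ones would be concatenated,
    # SVD-reduced Fisher score vectors.
    pairs = [[2.0, 1.0], [1.0, 2.0], [2.0, 2.0],
             [-2.0, -1.0], [-1.0, -2.0], [-2.0, -2.0]]
    labels = [1, 1, 1, -1, -1, -1]
    print(leave_one_out_accuracy(pairs, labels))
```

Leave-one-out retrains the classifier once per example, which is affordable for small structural datasets but grows expensive as the number of pairs increases. In practice a library SVM with a tuned kernel would normally replace the hand-written training loop shown here.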
