A regularized discriminative model for the prediction of protein-peptide interactions.

Author information

Lehrach Wolfgang P, Husmeier Dirk, Williams Christopher K I

Affiliation

University of Edinburgh, Edinburgh EH1 2QL, UK.

Publication information

Bioinformatics. 2006 Mar 1;22(5):532-40. doi: 10.1093/bioinformatics/bti804. Epub 2006 Jan 5.

Abstract

MOTIVATION

Short, well-defined domains known as peptide recognition modules (PRMs) regulate many important protein-protein interactions involved in the formation of macromolecular complexes and biochemical pathways. Since high-throughput experiments like yeast two-hybrid and phage display are expensive and intrinsically noisy, it would be desirable to target them more specifically, or to partially bypass them, with complementary in silico approaches. Here we present a probabilistic discriminative approach to predicting PRM-mediated protein-protein interactions from sequence data. The model is motivated by the discriminative model of Segal and Sharan as an alternative to the generative approach of Reiss and Schwikowski. In our evaluation, we focus on predicting the interaction network. As proposed by Williams, we overcome the problem of susceptibility to over-fitting by adopting a Bayesian a posteriori approach based on a Laplacian prior in parameter space.
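
As a brief illustration of why a Laplacian prior counters over-fitting, the following is a generic sketch rather than the paper's exact formulation. The symbols are assumptions introduced here: $\theta$ for the model parameters, $D$ for the observed interaction data, and $\lambda$ for the scale of the prior. With an independent Laplacian prior on each parameter, maximizing the posterior amounts to maximizing the log-likelihood minus an L1 penalty, which tends to drive many parameters to exactly zero.

```latex
% Independent Laplacian prior over the parameters (lambda > 0 is an assumed scale)
p(\theta) = \prod_i \frac{\lambda}{2} \exp\bigl(-\lambda \lvert \theta_i \rvert\bigr)

% Maximum a posteriori estimate; the constant term of \log p(\theta) is dropped
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta} \bigl[ \log p(D \mid \theta) + \log p(\theta) \bigr]
  = \arg\max_{\theta} \Bigl[ \log p(D \mid \theta) - \lambda \sum_i \lvert \theta_i \rvert \Bigr]
```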

RESULTS

The proposed method was tested on two datasets of protein-protein interactions involving 28 SH3 domain proteins in Saccharomyces cerevisiae, where the datasets were obtained with different experimental techniques. The predictions were evaluated with out-of-sample receiver operating characteristic (ROC) curves. In both cases, Laplacian regularization turned out to be crucial for achieving a reasonable generalization performance. The Laplacian-regularized discriminative model outperformed the generative model of Reiss and Schwikowski in terms of the area under the ROC curve on both datasets. The performance was further improved with a hybrid approach, in which our model was initialized with the motifs obtained with the method of Reiss and Schwikowski.
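
For readers unfamiliar with the evaluation metric, the area under the ROC curve can be computed as the fraction of (interacting, non-interacting) pairs that a model ranks in the correct order, with ties counted as half. The sketch below is a generic illustration and not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for an observed interaction, 0 for a non-interaction.
    scores: predicted interaction scores (higher means more likely).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions (illustrative values only).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(toy_labels, toy_scores))  # prints 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the area under the curve is a convenient single number for comparing models on the same held-out data.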

AVAILABILITY

Software and supplementary material are available from http://lehrach.com/wolfgang/dmf.
