Choosing negative examples for the prediction of protein-protein interactions.

Authors

Ben-Hur Asa, Noble William Stafford

Affiliation

Department of Computer Science, Colorado State University, Fort Collins CO, USA.

Publication

BMC Bioinformatics. 2006 Mar 20;7 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2105-7-S1-S2.

Abstract

The protein-protein interaction networks of even well-studied model organisms are sketchy at best, highlighting the continued need for computational methods to help direct experimentalists in the search for novel interactions. This need has prompted the development of a number of methods for predicting protein-protein interactions based on various sources of data and methodologies. The common method for choosing negative examples for training a predictor of protein-protein interactions is based on annotations of cellular localization and on the observation that pairs of proteins with different localization patterns are unlikely to interact. While this method leads to high-quality sets of non-interacting proteins, we find that this choice can lead to biased estimates of prediction accuracy, because the constraints placed on the distribution of the negative examples make the task easier. The effects of this bias are demonstrated in the context of both sequence-based and non-sequence-based features used for predicting protein-protein interactions.
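
The localization-based selection described in the abstract can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the protein identifiers, the compartment annotations, and the function names are toy illustrations, not the authors' data or code. The unconstrained baseline shown here (uniform sampling from pairs not known to interact) is an assumption for comparison; the abstract does not state which alternative the authors use.

```python
import itertools
import random


def candidate_negatives(proteins, positives):
    """Return all unordered protein pairs not in the known-interaction set."""
    known = {frozenset(pair) for pair in positives}
    return [
        (a, b)
        for a, b in itertools.combinations(sorted(proteins), 2)
        if frozenset((a, b)) not in known
    ]


def localization_negatives(candidates, localization):
    """Keep only pairs whose annotated compartments do not overlap."""
    return [
        (a, b)
        for a, b in candidates
        if localization.get(a)
        and localization.get(b)
        and not (localization[a] & localization[b])
    ]


def sample_negatives(pairs, n, seed=0):
    """Draw up to n pairs uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(pairs, min(n, len(pairs)))


# Toy annotations (hypothetical, for illustration only).
localization = {
    "P1": {"nucleus"},
    "P2": {"nucleus"},
    "P3": {"mitochondrion"},
    "P4": {"cytoplasm", "nucleus"},
}
positives = [("P1", "P2")]  # hypothetical known interaction

# Unconstrained pool: every pair not known to interact.
candidates = candidate_negatives(localization, positives)

# Constrained pool: only pairs annotated to disjoint compartments.
restricted = localization_negatives(candidates, localization)

print(len(candidates), "unconstrained candidate pairs")
print(len(restricted), "pairs with disjoint localization")
print(sample_negatives(candidates, 3))
```

The constrained pool is a strict subset of the unconstrained one, so a classifier evaluated on it faces a narrower distribution of negatives. That narrowing is the source of the bias the abstract describes.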

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9aa/1810313/788dcfeb67af/1471-2105-7-S1-S2-1.jpg
