Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6, Canada.
Sci Rep. 2018 Aug 3;8(1):11694. doi: 10.1038/s41598-018-30044-1.
All protein-protein interaction (PPI) predictors require an operational decision threshold to differentiate positive PPIs from negatives. Historically, a single global threshold, typically optimized via cross-validation testing, has been applied to all protein pairs. Here, we use data visualization techniques to show that no single decision threshold is suitable for all protein pairs, given the inherent diversity of protein interaction profiles. The recent development of high-throughput PPI predictors has enabled the comprehensive scoring of all possible protein-protein pairs. This, in turn, makes it possible to evaluate a putative PPI within the context of all possible predictions. Leveraging this context, we introduce a novel modeling framework called Reciprocal Perspective (RP), which estimates a localized threshold on a per-protein basis using several rank-order metrics. By considering a putative PPI from the perspective of each protein within the pair, RP rescores the predicted PPI and applies a cascaded Random Forest classifier, leading to improvements in recall and precision. We validate RP using two state-of-the-art PPI predictors, the Protein-protein Interaction Prediction Engine and the Scoring PRotein INTeractions methods, over five organisms: Homo sapiens, Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, and Mus musculus. Results demonstrate that applying a post hoc RP rescoring layer significantly improves classification (p < 0.001) in all cases over all organisms, and that this rescoring approach can be applied to any PPI prediction method.
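The idea of viewing one predicted pair from each partner's perspective can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rank_in_row`, `reciprocal_features`), the choice of features, and the toy score matrix are all assumptions made for illustration, and the cascaded Random Forest stage is omitted. The sketch only shows how a single predictor score can rank very differently within each protein's own list of candidate partners.

```python
def rank_in_row(scores, i, j):
    """1-based rank of partner j among protein i's candidate partners,
    ordered by descending score (ties share the better rank)."""
    row = scores[i]
    # Count other partners (not i itself, not j) that score strictly higher than j.
    higher = sum(1 for k, s in enumerate(row) if k != i and k != j and s > row[j])
    return higher + 1


def reciprocal_features(scores, i, j):
    """Illustrative rank-order features for the pair (i, j), seen from both proteins."""
    n_partners = len(scores) - 1  # every protein except itself
    rank_from_i = rank_in_row(scores, i, j)  # how i "sees" j
    rank_from_j = rank_in_row(scores, j, i)  # how j "sees" i
    return {
        "score": scores[i][j],
        "rank_from_i": rank_from_i,
        "rank_from_j": rank_from_j,
        "frac_rank_from_i": rank_from_i / n_partners,
        "frac_rank_from_j": rank_from_j / n_partners,
    }


# Toy symmetric score matrix for four hypothetical proteins (made-up values).
scores = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.8, 0.7],
    [0.2, 0.8, 0.0, 0.3],
    [0.1, 0.7, 0.3, 0.0],
]

features = reciprocal_features(scores, 1, 3)
```

In this toy example the pair (1, 3) has a score of 0.7. From protein 1's perspective that is its lowest-ranked partner (rank 3 of 3), whereas from protein 3's perspective it is the top-ranked partner (rank 1 of 3). A single global threshold would treat both views identically; features like these could instead be passed to a downstream classifier.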