Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada.
Agriculture and Agri-Food Canada, Ottawa Research and Development Centre, Ottawa, Ontario, K1A 0C6, Canada.
Sci Rep. 2020 Jan 29;10(1):1390. doi: 10.1038/s41598-019-56895-w.
The need for larger-scale and increasingly complex protein-protein interaction (PPI) prediction tasks demands that state-of-the-art predictors be highly efficient and adapted to inter- and cross-species predictions. Furthermore, the ability to generate comprehensive interactomes has enabled the appraisal of each PPI in the context of all predictions, leading to further improvements in classification performance in the face of extreme class imbalance using the Reciprocal Perspective (RP) framework. Here we describe the PIPE4 algorithm. Adaptation of the PIPE3/MP-PIPE sequence preprocessing step led to a speedup of upwards of 50x, and the new Similarity Weighted Score appropriately normalizes for window frequency when applied to any inter- or cross-species prediction schema. Comprehensive interactomes are generated for three prediction schemas: (1) cross-species predictions, where Arabidopsis thaliana is used as a proxy to predict the comprehensive Glycine max interactome; (2) inter-species predictions between Homo sapiens and HIV1; and (3) a combined schema involving both cross- and inter-species predictions, where both Arabidopsis thaliana and Caenorhabditis elegans are used as proxy species to predict the interactome between Glycine max (the soybean legume) and Heterodera glycines (the soybean cyst nematode). In comparisons with state-of-the-art methods, PIPE4 showed improved performance, indicating that it should be the method of choice for complex PPI prediction schemas.
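The idea of appraising each PPI "in the context of all predictions" can be illustrated with a toy sketch. It is not the RP framework or PIPE4 itself: the function name `reciprocal_ranks`, the input layout, and the example scores are all assumptions made for illustration. The sketch only shows how one predicted pair can be viewed from each partner's perspective, by ranking its score among all predictions involving that partner.

```python
from collections import defaultdict


def reciprocal_ranks(scores):
    """Rank each predicted pair from both partners' perspectives.

    scores: dict mapping (protein_a, protein_b) -> predicted score.
    Returns a dict mapping each pair to
    (rank among all of protein_a's pairs, rank among all of protein_b's pairs),
    where rank 1 is the highest score. Hypothetical helper, not part of PIPE4.
    """
    # Collect every scored pair under each protein that takes part in it.
    by_protein = defaultdict(list)
    for (a, b), score in scores.items():
        by_protein[a].append(((a, b), score))
        if b != a:
            by_protein[b].append(((a, b), score))

    # For each protein, sort its pairs by descending score and record the ranks.
    rank_of = {}
    for protein, entries in by_protein.items():
        ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
        for rank, (pair, _) in enumerate(ordered, start=1):
            rank_of[(protein, pair)] = rank

    return {
        pair: (rank_of[(pair[0], pair)], rank_of[(pair[1], pair)])
        for pair in scores
    }


# Made-up scores for two proteins of one species (A, B) against two of another (X, Y).
example_scores = {
    ("A", "X"): 0.90,
    ("A", "Y"): 0.20,
    ("B", "X"): 0.95,
    ("B", "Y"): 0.10,
}
example_ranks = reciprocal_ranks(example_scores)
```

In this made-up example the pair (A, X) is A's top-scoring partner but only X's second, while (B, X) ranks first from both sides. A pair that ranks highly from both perspectives is the kind of signal a context-aware appraisal can exploit under heavy class imbalance.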