Benchmark Evaluation of Protein-Protein Interaction Prediction Algorithms.

Affiliation

Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15232, USA.

Publication information

Molecules. 2021 Dec 22;27(1):41. doi: 10.3390/molecules27010041.

DOI:10.3390/molecules27010041
PMID:35011283
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8746451/
Abstract

Protein-protein interactions (PPIs) perform various functions and regulate processes throughout cells. Knowledge of the full network of PPIs is vital to biomedical research, but most PPIs are still unknown. Because it is infeasible to discover all of them experimentally, given technical and resource limitations, computational prediction of PPIs is essential, and the performance of prediction algorithms must be assessed accurately before further application or translation. However, many published methods compose their evaluation datasets incorrectly, using a higher proportion of positive-class data than occurs naturally, which leads to exaggerated performance. We re-implemented various published algorithms and evaluated them on datasets with realistic data compositions. We found that their performance is overstated in the original publications, with several methods outperformed by our control models built on 'illogical' and random-number features. We conclude that these methods are influenced by the over-characterization of some proteins in the literature and by the scale-free nature of the PPI network, and that they fail when tested on all possible protein pairs. Additionally, we found that sequence-only algorithms performed worse than those that employ functional and expression features. We present a benchmark evaluation of many published algorithms for PPI prediction. The source code of our implementations and the benchmark datasets created here are available as open source.
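
The effect of dataset composition described in the abstract can be illustrated with a short calculation. The sketch below is not taken from the paper. It assumes a hypothetical classifier with fixed sensitivity and specificity and uses illustrative prevalence values (50% positives for a balanced test set, 0.1% for a sparse one) to show how precision depends on the fraction of positive pairs in the evaluation data. The function name and all numbers are assumptions chosen for illustration.

```python
def precision_at_prevalence(sensitivity: float, specificity: float,
                            prevalence: float) -> float:
    """Expected precision of a classifier on data with a given positive fraction.

    precision = TP / (TP + FP), where, per unit of data,
    TP = sensitivity * prevalence and
    FP = (1 - specificity) * (1 - prevalence).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical classifier: 95% sensitivity and 95% specificity (assumed values).
    sens, spec = 0.95, 0.95
    # Illustrative positive fractions; these are not the ratios used in the paper.
    for prevalence in (0.5, 0.1, 0.01, 0.001):
        p = precision_at_prevalence(sens, spec, prevalence)
        print(f"positive fraction {prevalence:>6}: precision {p:.3f}")
```

With the same classifier, precision is high on the balanced set and drops sharply as positives become rare, which is why a score reported on an artificially balanced dataset can overstate how a method would perform across all possible protein pairs.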

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78be/8746451/2dba7e90edfe/molecules-27-00041-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78be/8746451/8f484eef4242/molecules-27-00041-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78be/8746451/139656a94ca6/molecules-27-00041-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78be/8746451/2955ebf7e291/molecules-27-00041-g003.jpg
