PEPPI: Whole-proteome Protein-protein Interaction Prediction through Structure and Sequence Similarity, Functional Association, and Machine Learning.

Author information

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA.

Publication information

J Mol Biol. 2022 Jun 15;434(11):167530. doi: 10.1016/j.jmb.2022.167530. Epub 2022 Mar 5.

Abstract

Proteome-wide identification of protein-protein interactions is a formidable task which has yet to be sufficiently addressed by experimental methodologies. Many computational methods have been developed to predict proteome-wide interaction networks, but few leverage both the sensitivity of structural information and the wide availability of sequence data. We present PEPPI, a pipeline which integrates structural similarity, sequence similarity, functional association data, and machine learning-based classification through a naïve Bayesian classifier model to accurately predict protein-protein interactions at a proteomic scale. Through benchmarking against a set of 798 ground truth interactions and an equal number of non-interactions, we have found that PEPPI attains 4.5% higher AUROC than the best of other state-of-the-art methods. As a proteomic-scale application, PEPPI was applied to model the interactions which occur between SARS-CoV-2 and human host cells during coronavirus infection, where 403 high-confidence interactions were identified with predictions covering 73% of a gold standard dataset from PSICQUIC and demonstrating significant complementarity with the most recent high-throughput experiments. PEPPI is available both as a webserver and in a standalone version and should be a powerful and generally applicable tool for computational screening of protein-protein interactions.
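
The abstract describes combining several evidence sources through a naïve Bayesian classifier and evaluating the result by AUROC on a balanced set of interactions and non-interactions. The following is only a minimal sketch of that general idea, not PEPPI's actual implementation: the Gaussian likelihood model, the function names (`fit_gaussian_nb`, `log_odds`, `auroc`), and the toy evidence scores are all assumptions made for illustration.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature mean/variance for each class.

    Assumption: each evidence score is modelled as an independent Gaussian
    per class. The source does not state which likelihood model PEPPI uses.
    """
    params = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        params[label] = (math.log(n / len(y)), means, variances)
    return params

def log_odds(params, x):
    """Log posterior odds of interaction (1) versus non-interaction (0)."""
    def log_joint(label):
        log_prior, means, variances = params[label]
        # Naive Bayes: per-feature log-likelihoods simply add up.
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return log_joint(1) - log_joint(0)

def auroc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up evidence scores per protein pair:
# [structural similarity, sequence similarity, functional association]
X = [[0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.7, 0.9, 0.6],
     [0.2, 0.3, 0.1], [0.1, 0.2, 0.3], [0.3, 0.1, 0.2]]
y = [1, 1, 1, 0, 0, 0]

model = fit_gaussian_nb(X, y)
scores = [log_odds(model, x) for x in X]
print(auroc(scores, y))
```

Because the toy data are scored on the same pairs used for fitting, the printed value says nothing about real predictive performance; a benchmark like the one in the abstract would score held-out pairs.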

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5cba/8897833/b342e0ebd9d4/ga1_lrg.jpg