Garzón José Ignacio, Deng Lei, Murray Diana, Shapira Sagi, Petrey Donald, Honig Barry
Center for Computational Biology and Bioinformatics, Department of Systems Biology, Columbia University, New York, United States.
School of Software, Central South University, Changsha, China.
eLife. 2016 Oct 22;5:e18715. doi: 10.7554/eLife.18715.
We present a database, PrePPI (Predicting Protein-Protein Interactions), of more than 1.35 million predicted protein-protein interactions (PPIs). Of these, at least 127,000 are expected to constitute direct physical interactions, although the actual number may be much larger (~500,000). The current PrePPI, which contains predicted interactions for about 85% of the human proteome, is related to an earlier version but is based on additional sources of interaction evidence and is far larger in scope. The use of structural relationships allows PrePPI to infer numerous previously unreported interactions. PrePPI has been subjected to a series of validation tests, including reproducing known interactions, recapitulating multi-protein complexes, analyzing disease-associated SNPs, and identifying functional relationships between interacting proteins. We show, using Gene Set Enrichment Analysis (GSEA), that predicted interaction partners can be used to annotate a protein's function. We provide annotations for most human proteins, including many annotated as having unknown function.