Max Planck Institute for Informatics, Saarbrücken, Germany.
Mol Cell Proteomics. 2011 Jun;10(6):M110.004929. doi: 10.1074/mcp.M110.004929. Epub 2011 Mar 30.
Recent large-scale data sets of protein complex purifications have provided unprecedented insights into the organization of cellular protein complexes. Several computational methods have been developed to detect co-complexed proteins in these data sets. Their common aim is the identification of biologically relevant protein complexes. However, much less is known about the network of direct physical protein contacts within the detected protein complexes. Therefore, our work investigates whether direct physical contacts can be computationally derived by combining raw data of large-scale protein complex purifications. We assess four established scoring schemes and introduce a new scoring approach that is specifically devised to infer direct physical protein contacts from protein complex purifications. The physical contacts identified by the five methods are comprehensively benchmarked against different reference sets that provide evidence for true physical contacts. Our results show that raw purification data can indeed be exploited to determine high-confidence physical protein contacts within protein complexes. In particular, our new method outperforms competing approaches at discovering physical contacts involving proteins that have been screened multiple times in purification experiments. It also excels in the analysis of recent protein purification screens of molecular chaperones and protein kinases. In contrast to previous findings, we observe that physical contacts inferred from purification experiments of protein complexes can be qualitatively comparable to binary protein interactions measured by experimental high-throughput assays such as yeast two-hybrid. This suggests that computationally derived physical contacts might complement binary protein interaction assays and guide large-scale interactome mapping projects by prioritizing putative physical contacts for further experimental screens.