Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain.
Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
PeerJ. 2014 May 29;2:e413. doi: 10.7717/peerj.413. eCollection 2014.
Macromolecular assemblies play an important role in almost all cellular processes. However, despite several large-scale studies, our current knowledge of protein complexes is still quite limited, which motivates the use of in silico predictions to gather information on complex composition in model organisms. Since protein-protein interactions place certain constraints on the functional divergence of macromolecular assemblies during evolution, it is possible to predict complexes based on orthology data. Here, we show that incorporating interaction information through network alignment significantly increases the precision of orthology-based complex prediction. Moreover, we performed a large-scale in silico screen for protein complexes in human, yeast and fly by aligning hundreds of known complexes to whole-organism interactomes. Systematic comparison of the resulting network alignments to all complexes currently known in those species revealed many conserved complexes, as well as several novel complex components. In addition to validating our predictions using orthogonal data, we were able to assign specific functional roles to the predicted complexes. In several cases, the incorporation of interaction data through network alignment made it possible to distinguish real complex components from other orthologous proteins. Our analyses indicate that current knowledge of yeast protein complexes exceeds that in other organisms and that predicting complexes in fly based on human and yeast data is complementary rather than redundant. Lastly, assessing the conservation of protein complexes of the human pathogen Mycoplasma pneumoniae, we discovered that its repertoire of complexes differs from that of eukaryotes, suggesting new points of therapeutic intervention. By contrast, targeting the pathogen's restriction enzyme complex might lead to adverse effects because of its similarity to ATP-dependent metalloproteases in the human host.
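The difference between purely orthology-based prediction and prediction that also uses interaction data can be illustrated with a minimal Python sketch. This is not the authors' network alignment method: the filter below is a deliberately simplified stand-in that keeps an ortholog only if it interacts with at least one other candidate in the target interactome. All protein names, the ortholog map, the interaction list and the reference complex are hypothetical toy data.

```python
from itertools import combinations


def orthology_candidates(complex_members, orthologs):
    """Map each source-complex protein to all of its orthologs in the target species."""
    candidates = set()
    for protein in complex_members:
        candidates.update(orthologs.get(protein, ()))
    return candidates


def filter_by_interactions(candidates, interactions):
    """Keep only candidates that interact with at least one other candidate.

    Simplified stand-in for network alignment (an assumption, not the
    published algorithm).
    """
    edges = {frozenset(pair) for pair in interactions}
    supported = set()
    for a, b in combinations(sorted(candidates), 2):
        if frozenset((a, b)) in edges:
            supported.update((a, b))
    return supported


def precision(predicted, reference):
    """Fraction of predicted components that belong to the reference complex."""
    return len(predicted & reference) / len(predicted) if predicted else 0.0


# Hypothetical toy data: a source complex, its orthologs in a target species,
# the target interactome, and the true target complex.
source_complex = {"A", "B", "C"}
orthologs = {"A": {"a1"}, "B": {"b1", "b2"}, "C": {"c1"}}
target_interactions = [("a1", "b1"), ("b1", "c1"), ("b2", "x9")]
reference_complex = {"a1", "b1", "c1"}

by_orthology = orthology_candidates(source_complex, orthologs)
by_alignment = filter_by_interactions(by_orthology, target_interactions)

print(sorted(by_orthology), precision(by_orthology, reference_complex))
print(sorted(by_alignment), precision(by_alignment, reference_complex))
```

In this toy case the paralog `b2` is an ortholog of a complex member but has no interaction with the other candidates, so the interaction filter drops it and precision rises from 0.75 to 1.0.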