School of Computer Engineering, Nanyang Technological University, Singapore, Singapore.
PLoS One. 2013;8(2):e53197. doi: 10.1371/journal.pone.0053197. Epub 2013 Feb 6.
Protein complexes are key entities that carry out cellular functions, and several human diseases have been shown to be associated with specific human protein complexes. Human protein complexes are widely used for protein function annotation, inference of the human protein interactome, disease gene prediction, and related tasks. An up-to-date catalogue of human complexes is therefore highly desirable to support research in these applications. As expected, protein complexes collected from different databases are highly redundant. In this paper, we designed a set of concise operations to consolidate these redundant human complexes and built a comprehensive catalogue called CHPC2012 (Catalogue of Human Protein Complexes). CHPC2012 achieves higher coverage of proteins and protein complexes than the individual databases. We also verified that its complexes are of high quality: its co-complex protein associations overlap substantially with the protein-protein interactions (PPIs) in various existing PPI databases. We demonstrate two distinct applications of CHPC2012: investigating the relationship between protein complexes and drug-related systems, and evaluating the quality of predicted protein complexes. In particular, CHPC2012 provides further insights into drug development. For instance, proteins involved in multiple complexes (the overlapping proteins) are potential drug targets; the drug-complex network can be used to investigate multi-target drugs and drug-drug interactions; and disease-specific complex-drug networks provide new clues for drug repositioning. With this up-to-date reference set of human protein complexes, we believe that the CHPC2012 catalogue can advance studies of protein interactions, protein functions, human diseases, drugs, and related fields of research. CHPC2012 complexes can be downloaded from http://www1.i2r.a-star.edu.sg/xlli/CHPC2012/CHPC2012.htm.
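The two quantities named in the abstract, the overlap between co-complex protein associations and known PPIs, and the set of overlapping proteins, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary layout, and the toy complexes and interactions are all assumptions made for illustration, and the overlap is taken here simply as the fraction of co-complex pairs that also occur in a PPI list.

```python
from collections import Counter
from itertools import combinations


def co_complex_pairs(complexes):
    """Return every unordered protein pair that shares at least one complex."""
    pairs = set()
    for members in complexes.values():
        # Sort so that each pair has one canonical orientation.
        for a, b in combinations(sorted(set(members)), 2):
            pairs.add((a, b))
    return pairs


def ppi_overlap(complexes, ppis):
    """Fraction of co-complex pairs that also appear in the given PPI list."""
    pairs = co_complex_pairs(complexes)
    if not pairs:
        return 0.0
    known = {tuple(sorted(p)) for p in ppis}
    return len(pairs & known) / len(pairs)


def overlapping_proteins(complexes):
    """Proteins that are members of more than one complex."""
    counts = Counter(p for members in complexes.values() for p in set(members))
    return {p for p, n in counts.items() if n > 1}


# Hypothetical toy data, not taken from CHPC2012.
toy_complexes = {
    "C1": ["A", "B", "C"],
    "C2": ["C", "D"],
}
toy_ppis = [("B", "A"), ("C", "D"), ("A", "E")]

print(ppi_overlap(toy_complexes, toy_ppis))
print(overlapping_proteins(toy_complexes))
```

In this toy input, protein C belongs to both complexes, so it is the only overlapping protein; with real data the same functions would take the downloaded complexes and a PPI database export in whatever identifier scheme they share.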