Jung Suk Hoon, Jang Woo-Hyuk, Hur Hee-Yung, Hyun Bora, Han Dong-Soo
School of Engineering, Information and Communications University, Yuseong-gu, Daejeon, Korea.
Genome Inform. 2008;21:77-88.
The increasing amount of available protein-protein interaction (PPI) data enables scalable methods for protein complex prediction. A protein complex is a group of two or more proteins held together by interactions that are stable over time, and it generally corresponds to a dense sub-graph in a PPI network (PPIN). However, dense sub-graphs correspond not only to stable protein complexes but also to sets of proteins connected by dynamic interactions. As a result, conventional graph-theoretic clustering methods that rely on the PPIN alone have high false positive rates in protein complex prediction. In this paper, we propose an approach to predicting protein complexes that integrates PPI data with mutually exclusive interaction information drawn from structural interface data of protein domains. The core of our approach is the extraction of the Simultaneous Protein Interaction Cluster (SPIC), which removes interaction conflicts from network clusters by enforcing mutual exclusion among interactions. The SPIC concept was applied to two conventional graph-theoretic clustering algorithms, MCODE and LCMA, to evaluate the density of clusters for protein complex prediction. Comparison with the original graph-theoretic clustering algorithms verified the effectiveness of our approach: the SPIC-based methods refined false positives of the original methods into true positive complexes, without losing any of the true positive predictions yielded by the original methods.
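The idea of removing interaction conflicts before scoring a cluster can be illustrated with a small sketch. The abstract does not specify how a SPIC is actually extracted, so everything here is an assumption made for illustration: the helper names (`edge`, `simultaneous_edges`, `density`), the greedy selection order, the toy proteins A to D, and the use of the standard undirected graph density rather than the exact scoring used by MCODE or LCMA. Two interactions are treated as mutually exclusive when they are listed as a conflicting pair, for example because they compete for the same domain interface.

```python
def edge(a, b):
    """Canonical undirected interaction between proteins a and b."""
    return frozenset((a, b))


def simultaneous_edges(edges, exclusive_pairs):
    """Greedily keep interactions that conflict with no interaction kept so far.

    `exclusive_pairs` is a set of frozensets, each holding two edges that
    cannot occur at the same time (hypothetical input format).
    This is a simple greedy heuristic, not the method from the paper.
    """
    kept = []
    # Sort for a deterministic order; each edge is compared as a sorted name list.
    for e in sorted(edges, key=sorted):
        if all(frozenset((e, k)) not in exclusive_pairs for k in kept):
            kept.append(e)
    return kept


def density(nodes, edges):
    """Standard undirected graph density: 2|E| / (|V| (|V| - 1))."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(edges) / (n * (n - 1))


# Toy cluster: four proteins and four observed interactions (made-up data).
nodes = {"A", "B", "C", "D"}
edges = {edge("A", "B"), edge("A", "C"), edge("B", "C"), edge("A", "D")}

# Assumption: A-B and A-D use the same interface on A, so they are exclusive.
exclusive = {frozenset((edge("A", "B"), edge("A", "D")))}

kept = simultaneous_edges(edges, exclusive)
raw_density = density(nodes, edges)
spic_density = density(nodes, kept)
print(raw_density, spic_density)
```

In this toy case the conflicting interaction A-D is dropped, so the cluster's density falls once only simultaneously possible interactions are counted. A cluster that looks dense only because of mutually exclusive interactions would therefore score lower, which is the intuition behind reducing false positives.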