Department of Computer Science, School of Information Science and Engineering, Central South University, Changsha 410083, China.
IEEE/ACM Trans Comput Biol Bioinform. 2011 May-Jun;8(3):607-20. doi: 10.1109/TCBB.2010.75.
With advances in technologies for predicting protein interactions, huge data sets portrayed as networks have become available. Identification of functional modules from such networks is crucial for understanding the principles of cellular organization and function. However, protein interaction data produced by high-throughput experiments are generally associated with high false-positive rates, which makes it difficult to identify functional modules accurately. In this paper, we propose a fast hierarchical clustering algorithm, HC-PIN, based on the local metric of edge clustering value, which can be used in both unweighted and weighted networks. HC-PIN is applied to the yeast protein interaction network, and the identified modules are validated against all three types of Gene Ontology (GO) terms: Biological Process, Molecular Function, and Cellular Component. The experimental results show that HC-PIN is not only robust to false positives but can also discover functional modules with low density. The identified modules are statistically significant in terms of all three types of GO annotations. Moreover, by varying its parameter value, HC-PIN can uncover the hierarchical organization of functional modules, which corresponds approximately to the hierarchical structure of GO annotations. Compared to previous competing algorithms, HC-PIN is faster and more accurate.
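The abstract names the edge clustering value as the local metric but does not give its formula. As a rough illustration only, the Python sketch below scores each edge of an unweighted network by the squared size of the two endpoints' shared neighborhood divided by the product of their neighborhood sizes. The formula, the choice to count each node in its own neighborhood, and the function name `edge_clustering_values` are assumptions made for this example, not details confirmed by the abstract.

```python
def edge_clustering_values(edges):
    """Score each edge of an unweighted, undirected network.

    Assumed definition (not stated in the abstract):
        score(u, v) = |N_u & N_v|**2 / (|N_u| * |N_v|)
    where N_x is the neighbor set of x, here taken to include x itself.
    """
    # Build an adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in edges:
        # Closed neighborhoods: each node counts itself (an assumption).
        n_u = neighbors[u] | {u}
        n_v = neighbors[v] | {v}
        common = len(n_u & n_v)
        scores[(u, v)] = common ** 2 / (len(n_u) * len(n_v))
    return scores


# Toy network: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_scores = edge_clustering_values(toy_edges)

# Edges inside the triangle score higher than the pendant edge, so a
# clustering procedure that visits edges in descending score order would
# consider the dense region first.
ranked_edges = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

Under this assumed definition, an edge whose endpoints share most of their neighbors scores close to 1, while an edge leading to a sparsely connected node scores lower. This illustrates why such a local metric can separate densely connected regions from likely spurious links.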