Institute for High Performance Computing and Networking-ICAR, National Research Council of Italy-CNR, Via P. Bucci 41C, 87036 Rende-CS, Italy.
IEEE/ACM Trans Comput Biol Bioinform. 2012 May-Jun;9(3):717-30. doi: 10.1109/TCBB.2011.158.
Several approaches have been presented in the literature to cluster Protein-Protein Interaction (PPI) networks. They can be grouped into two main categories: those allowing a protein to participate in different clusters and those generating only nonoverlapping clusters. In both cases, a challenging task is to find a suitable compromise between the biological relevance of the results and a comprehensive coverage of the analyzed networks. Indeed, methods returning highly accurate results are often able to cover only small parts of the input PPI network, especially when poorly characterized networks are considered. We present a coclustering-based technique able to generate both overlapping and nonoverlapping clusters. The density of the clusters to search for can also be set by the user. We tested our method on two networks, yeast and human, and compared it to five other well-known techniques on the same interaction data sets. The results showed that, for all the examples considered, our approach always reaches a good compromise between accuracy and network coverage. Furthermore, the behavior of our algorithm is not influenced by the structure of the input network, unlike all the techniques considered in the comparison, which returned very good results on the yeast network but rather poor results on the human network.
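To make the two quantities traded off above concrete, here is a minimal Python sketch of cluster density and network coverage. The abstract does not give formulas, so both definitions are assumptions: density is the usual fraction of possible protein pairs in a cluster that actually interact, and coverage is the fraction of the network's proteins that fall in at least one cluster. The function names and the toy network are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def cluster_density(cluster, edges):
    """Fraction of possible protein pairs in `cluster` that interact (assumed definition)."""
    nodes = set(cluster)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Undirected interactions: store each edge as an unordered pair.
    edge_set = {frozenset(e) for e in edges}
    present = sum(
        1 for u, v in combinations(sorted(nodes), 2)
        if frozenset((u, v)) in edge_set
    )
    return 2.0 * present / (n * (n - 1))


def network_coverage(clusters, edges):
    """Fraction of proteins in the network that belong to at least one cluster (assumed definition)."""
    proteins = {p for e in edges for p in e}
    if not proteins:
        return 0.0
    clustered = set().union(*clusters) & proteins
    return len(clustered) / len(proteins)


def filter_by_density(clusters, edges, min_density):
    """Keep only clusters at least as dense as a user-chosen threshold."""
    return [c for c in clusters if cluster_density(c, edges) >= min_density]


# Hypothetical toy network; protein C belongs to two overlapping clusters.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
clusters = [{"A", "B", "C"}, {"C", "D"}]

print(cluster_density({"A", "B", "C", "D"}, edges))
print(network_coverage(clusters, edges))
```

In this toy case, raising the density threshold keeps only tightly connected groups and leaves protein E unassigned, which is the kind of accuracy-versus-coverage tension the abstract describes.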