Comparison and evaluation of network clustering algorithms applied to genetic interaction networks.

Authors

Hou Lin, Wang Lin, Berg Arthur, Qian Minping, Zhu Yunping, Li Fangting, Deng Minghua

Affiliation

LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, China.

Publication

Front Biosci (Elite Ed). 2012 Jan 1;4(6):2150-61. doi: 10.2741/e532.

Abstract

Network clustering algorithms aim to detect dense clusters in a network, providing a first step towards understanding large-scale biological networks. With numerous recent advances in biotechnology, large-scale genetic interaction data are widely available, but there is limited understanding of which clustering algorithms are most effective. To address this problem, we conducted a systematic study comparing and evaluating six clustering algorithms for analyzing genetic interaction networks, and investigated the factors that influence the choice of algorithm. The algorithms considered are hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and the variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used. Accuracy was measured by the Jaccard index between predicted gene modules and benchmark gene sets. The results suggest that the best choice depends on network topology and evaluation criteria. Hierarchical clustering performed best at predicting protein complexes; Bayesian discriminant analysis based community detection performed best on epistatic miniarray profile (EMAP) datasets; and the variational Bayes approach to modularity was noticeably better than the other algorithms on genome-scale networks.
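As an illustration of the evaluation measure named in the abstract, the Jaccard index of two gene sets A and B is |A ∩ B| / |A ∪ B|. The sketch below is a minimal example, not the authors' evaluation pipeline. The abstract does not say how per-module scores are aggregated, so the best-match-then-average scheme, the function names, and the toy gene identifiers are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of gene identifiers."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets have no defined overlap; return 0.0 by convention here.
    return len(a & b) / len(union) if union else 0.0


def best_match_scores(predicted_modules, benchmark_sets):
    """For each predicted module, the highest Jaccard index against any benchmark set.

    Assumption: best-match scoring is one plausible aggregation, not
    necessarily the one used in the paper.
    """
    return [
        max((jaccard(module, ref) for ref in benchmark_sets), default=0.0)
        for module in predicted_modules
    ]


# Hypothetical toy data: gene names are placeholders, not real results.
predicted = [{"g1", "g2", "g3"}, {"g4", "g5"}]
benchmark = [{"g1", "g2", "g3", "g6"}, {"g5", "g7"}]

scores = best_match_scores(predicted, benchmark)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

In this toy case the first module shares three of four genes with its best-matching benchmark set, and the second shares one of three, so a module that only partially recovers a benchmark set is penalized in proportion to the mismatch.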
