Sun Peng, Guo Jiong, Baumbach Jan
BMC Proc. 2013 Dec 20;7(Suppl 7):S9. doi: 10.1186/1753-6561-7-S7-S9.
The explosion of biological data has dramatically reshaped biological research. The biggest challenge for biologists and bioinformaticians is integrating and analyzing large quantities of data to obtain meaningful insights. One major problem is the combined analysis of data of different types. Bi-cluster editing, a special case of clustering that partitions two different types of data simultaneously, might be applicable to several biomedical scenarios. However, the underlying algorithmic problem is NP-hard.
Here we present BiCluE, a software package designed to solve the weighted bi-cluster editing problem. It implements (1) an exact algorithm based on fixed-parameter tractability and (2) a polynomial-time greedy heuristic that solves the hardest part, edge deletions, first. We evaluated its performance on artificial graphs. We then applied our implementation to real-world biomedical data, in this case GWAS data, as an example. In general, BiCluE works on any type of data that can be modeled as a (weighted or unweighted) bipartite graph.
To our knowledge, this is the first software package solving the weighted bi-cluster editing problem. BiCluE as well as the supplementary results are available online at http://biclue.mpi-inf.mpg.de.
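To make the objective of weighted bi-cluster editing concrete, the following minimal sketch computes the cost of one candidate partition of a small weighted bipartite graph. It is an illustration only and is not taken from BiCluE: the function name `editing_cost`, the node labels, the weights, and the convention that a positive weight means "edge present" while a non-positive weight means "edge absent" (with the absolute value as the cost of changing it) are all assumptions made for this example.

```python
def editing_cost(weights, clusters):
    """Cost of turning a weighted bipartite graph into the given bi-clusters.

    weights:  dict mapping (u, v) -> float for every pair across the two
              node sets. Assumed convention: w > 0 means the edge exists,
              w <= 0 means it does not; |w| is the cost of editing it.
    clusters: list of (left_nodes, right_nodes) pairs; each pair is meant
              to become one complete bipartite subgraph (a bi-cluster).
    """
    # Map every node to the index of the bi-cluster it belongs to.
    left_label = {u: i for i, (left, _) in enumerate(clusters) for u in left}
    right_label = {v: i for i, (_, right) in enumerate(clusters) for v in right}

    cost = 0.0
    for (u, v), w in weights.items():
        same_cluster = left_label[u] == right_label[v]
        if w > 0 and not same_cluster:
            cost += w    # existing edge between two bi-clusters: delete it
        elif w <= 0 and same_cluster:
            cost += -w   # missing edge inside a bi-cluster: insert it
    return cost


# Hypothetical toy instance: three "g" nodes on one side, three "p" nodes
# on the other (for example genes and phenotypes in a GWAS-like setting).
weights = {
    ("g1", "p1"): 2.0, ("g1", "p2"): 1.0, ("g1", "p3"): -1.0,
    ("g2", "p1"): 1.5, ("g2", "p2"): -0.5, ("g2", "p3"): 0.5,
    ("g3", "p1"): -1.0, ("g3", "p2"): -2.0, ("g3", "p3"): 3.0,
}

two_clusters = [({"g1", "g2"}, {"p1", "p2"}), ({"g3"}, {"p3"})]
one_cluster = [({"g1", "g2", "g3"}, {"p1", "p2", "p3"})]

print(editing_cost(weights, two_clusters))  # one insertion and one deletion
print(editing_cost(weights, one_cluster))   # four insertions
```

Under these assumptions the two-cluster partition is cheaper than merging everything into a single bi-cluster. An exact solver searches for the partition of minimum cost, whereas a heuristic settles for a partition of low cost; this sketch only evaluates a partition that is already given.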