Department of Computer Science and Engineering, Netaji Subhash Engineering College, Kolkata 700152, India.
Bioinformatics. 2009 Nov 1;25(21):2795-801. doi: 10.1093/bioinformatics/btp526. Epub 2009 Sep 3.
Biclustering has emerged as a powerful tool for identifying groups of genes that are co-expressed under a subset of the experimental conditions (measurements) in a gene expression dataset. Several biclustering algorithms have been proposed to date. In this article, we address some important shortcomings of these existing algorithms and propose a new correlation-based biclustering algorithm called the bi-correlation clustering algorithm (BCCA).
BCCA produces a diverse set of biclusters of co-regulated genes over a subset of samples, where all the genes in a bicluster show a similar change of expression pattern over that subset. Moreover, the genes in a bicluster have common transcription factor binding sites in their corresponding promoter sequences. The presence of these common binding sites is evidence that the genes in a bicluster are co-regulated. Biclusters determined by BCCA also show highly enriched functional categories. Using different gene expression datasets, we demonstrate the strength and superiority of BCCA over some existing biclustering algorithms.
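The correlation criterion described above can be illustrated with a minimal sketch. This is not the BCCA implementation: it only checks whether a given set of genes and samples satisfies a pairwise Pearson correlation threshold. The function names, the threshold `theta` and the toy expression matrix are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a flat profile is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def is_correlated_bicluster(expr, genes, samples, theta=0.9):
    """Return True if every pair of the chosen genes has Pearson
    correlation >= theta when restricted to the chosen samples.

    expr    : list of rows (genes), each a list of expression values
    genes   : row indices forming the candidate bicluster
    samples : column indices forming the candidate bicluster
    theta   : hypothetical correlation threshold
    """
    profiles = [[expr[g][s] for s in samples] for g in genes]
    return all(pearson(p, q) >= theta for p, q in combinations(profiles, 2))


# Toy data (invented): genes 0-2 rise together over samples 0-2 only.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 4.0, 6.0, 1.0],
    [5.0, 6.0, 7.0, 7.0],
    [3.0, 1.0, 2.0, 4.0],
]

print(is_correlated_bicluster(expr, [0, 1, 2], [0, 1, 2]))     # subset of samples
print(is_correlated_bicluster(expr, [0, 1, 2], [0, 1, 2, 3]))  # all samples
```

In the toy data, genes 0 to 2 share a pattern only on the first three samples, which is the situation biclustering targets: clustering over all samples would miss the group.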
The software for BCCA was developed in C and Visual Basic and runs on Microsoft Windows platforms. It can be downloaded as a zip file from http://www.isical.ac.in/~rajat and must then be installed. Two Word files included in the zip file should be consulted before installing and running the software.
Supplementary data are available at Bioinformatics online.