Department of Computer Science, Cochin University of Science and Technology, Kochin, Kerala, India.
Adv Exp Med Biol. 2011;696:123-34. doi: 10.1007/978-1-4419-7046-6_13.
The goal of biclustering a gene expression data matrix is to find a submatrix in which the genes show highly correlated activity across all of the submatrix's conditions. A measure called the mean squared residue (MSR) is used to evaluate the coherence of the rows and columns of the submatrix simultaneously. The MSR difference is the increase in MSR that results when a gene or condition is added to the bicluster. In this chapter, three biclustering algorithms that use an MSR threshold (MSRT) and an MSR difference threshold (MSRDT) are evaluated and compared. All three methods start from seeds generated by the K-Means clustering algorithm and then enlarge these seeds by adding more genes and conditions. The first algorithm uses the MSRT alone. The second and third algorithms use both the MSRT and the newly introduced MSRDT, which yields highly coherent biclusters. The third algorithm differs from the second in how the MSRDT is calculated. Results on benchmark datasets indicate that these algorithms outperform many metaheuristic algorithms.
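As an illustration of the two thresholds, the following is a minimal sketch and not the chapter's own implementation. It assumes the standard Cheng and Church definition of MSR, in which the residue of an entry is the entry minus its row mean, minus its column mean, plus the mean of the whole submatrix. The function names `msr` and `grow_rows`, the parameter names, the greedy scan order, and the toy matrix are all hypothetical. The seed is passed in directly rather than produced by K-Means. Only genes (rows) are added here; conditions (columns) could be handled the same way.

```python
def msr(matrix, rows, cols):
    """Mean squared residue of the submatrix given by rows x cols.

    Assumes the Cheng and Church definition:
    residue(i, j) = a_ij - row_mean_i - col_mean_j + overall_mean.
    """
    n_rows, n_cols = len(rows), len(cols)
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    overall = sum(row_mean.values()) / n_rows
    total = sum(
        (matrix[i][j] - row_mean[i] - col_mean[j] + overall) ** 2
        for i in rows
        for j in cols
    )
    return total / (n_rows * n_cols)


def grow_rows(matrix, seed_rows, cols, msr_threshold, msr_diff_threshold):
    """Greedily enlarge a seed with extra genes (rows).

    A candidate row is kept only if the enlarged bicluster stays at or
    below msr_threshold (MSRT) AND the increase in MSR caused by the
    row is at most msr_diff_threshold (MSRDT). The scan order (row
    index order) is an assumption made for this sketch.
    """
    rows = list(seed_rows)
    current = msr(matrix, rows, cols)
    for gene in range(len(matrix)):
        if gene in rows:
            continue
        candidate = msr(matrix, rows + [gene], cols)
        if candidate <= msr_threshold and candidate - current <= msr_diff_threshold:
            rows.append(gene)
            current = candidate
    return rows, current


# Toy data (hypothetical): rows 0-2 follow the same additive pattern,
# row 3 does not.
data = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
    [9, 1, 7, 2],
]
grown_rows, grown_msr = grow_rows(
    data, seed_rows=[0, 1], cols=[0, 1, 2, 3],
    msr_threshold=0.5, msr_diff_threshold=0.1,
)
print(grown_rows, grown_msr)
```

In this toy case the coherent third row is accepted because it leaves the MSR unchanged, while the fourth row is rejected because adding it would raise the MSR well above both thresholds. Setting `msr_diff_threshold` to a very large value reduces the sketch to an MSRT-only check, which corresponds in spirit to the first algorithm described above.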