Jothi R, Mohanty Sraban Kumar, Ojha Aparajita
Indian Institute of Information Technology, Design and Manufacturing Jabalpur, Madhya Pradesh, India.
Comput Biol Med. 2016 Apr 1;71:135-48. doi: 10.1016/j.compbiomed.2016.02.007. Epub 2016 Feb 21.
Clustering gene expression data is an important step in DNA microarray analysis. Although many clustering algorithms exist for gene expression analysis, finding a suitable and effective one remains challenging because of the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm that uses Eigenanalysis on a Minimum Spanning Tree based neighborhood graph (E-MST). Because the MST of a set of points reflects the similarity of the points to their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (the k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from the k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms.
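The abstract does not spell out how the rounds of MST are combined, so the following is only a minimal sketch of one common reading: each round computes an MST over the edges not used by earlier rounds, and the neighborhood graph is the union of all rounds. The function names `mst_edges` and `k_mst_graph`, the use of Euclidean distance, and the edge-removal rule are assumptions made for illustration, not the authors' implementation. The eigenanalysis step on the resulting similarity matrix is not shown.

```python
import math


def mst_edges(points, banned):
    """Prim's algorithm on the complete Euclidean graph, skipping banned edges.

    Returns a list of (i, j) index pairs with i < j.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known cost to attach each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        if math.isinf(best[u]):
            break  # the rest cannot be reached without banned edges
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((min(u, parent[u]), max(u, parent[u])))
        for v in range(n):
            if in_tree[v] or (min(u, v), max(u, v)) in banned:
                continue
            d = math.dist(points[u], points[v])
            if d < best[v]:
                best[v], parent[v] = d, u
    return edges


def k_mst_graph(points, k):
    """Union of k successive MSTs (assumed: edges of earlier rounds are excluded)."""
    graph = set()
    for _ in range(k):
        graph.update(mst_edges(points, graph))
    return graph


# Two small, well-separated groups of points (made-up illustrative data).
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
one_round = k_mst_graph(points, 1)
two_rounds = k_mst_graph(points, 2)
print(sorted(one_round))
print(len(two_rounds))
```

Adding rounds makes the graph denser around each point while still favoring short edges, which is the property a similarity matrix built from it would inherit.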