IEEE/ACM Trans Comput Biol Bioinform. 2018 Sep-Oct;15(5):1500-1512. doi: 10.1109/TCBB.2018.2834371. Epub 2018 May 11.
Tumor clustering is a powerful approach to cancer class discovery, which is crucial to the effective treatment of cancer. Many traditional clustering methods, such as NMF-based models, have been widely used to identify tumors. However, they cannot achieve satisfactory results. Recently, subspace clustering approaches have been proposed to improve performance by dividing the original space into multiple low-dimensional subspaces. Among them, low rank representation is becoming a popular way to achieve subspace clustering. In this paper, we propose a novel Low Rank Subspace Clustering model via Discrete Constraint and Hypergraph Regularization (DHLRS). The proposed method learns the cluster indicators directly by using a discrete constraint, which simplifies the clustering task. For each subspace, we adopt the Schatten p-norm to better approximate the low rank constraint. Moreover, hypergraph regularization is adopted to infer the complex relationships between genes and the intrinsic geometrical structure of the gene expression data in each subspace. Finally, the molecular pattern of the tumor gene expression data sets is discovered according to the optimized cluster indicators. Experiments on both synthetic data and real tumor gene expression data sets demonstrate the effectiveness of the proposed DHLRS.
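As an illustration of why a Schatten p-norm can approximate rank more closely than the nuclear norm, the following is a minimal NumPy sketch. The function name schatten_p_power and the toy matrix are assumptions made for illustration only; the abstract does not state the exact formulation or the value of p used in DHLRS.

```python
import numpy as np


def schatten_p_power(X, p):
    """Return sum_i sigma_i(X)**p, the p-th power of the Schatten p-(quasi-)norm.

    Hypothetical helper for illustration; not taken from the DHLRS paper.
    For p = 1 this is the nuclear norm. For 0 < p < 1 it is a quasi-norm
    that penalizes large singular values less and so tracks rank more closely.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    # Singular values only; the singular vectors are not needed here.
    sigma = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(sigma ** p))


# Toy rank-2 matrix with singular values 4, 0.25 and 0 (assumed example data).
X = np.diag([4.0, 0.25, 0.0])

rank = int(np.linalg.matrix_rank(X))   # true rank of X
nuclear = schatten_p_power(X, 1.0)     # nuclear norm (p = 1)
half = schatten_p_power(X, 0.5)        # Schatten quasi-norm power with p = 0.5

print(rank, nuclear, half)
```

On this toy matrix the p = 0.5 value lies closer to the true rank than the nuclear norm does, which is the usual motivation for choosing p < 1 as a rank surrogate.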