School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
Comput Intell Neurosci. 2021 Dec 8;2021:9990297. doi: 10.1155/2021/9990297. eCollection 2021.
Clustering tumor samples can help identify cancer types and discover new cancer subtypes, which is essential for effective cancer treatment. Although many traditional clustering methods have been proposed for tumor sample clustering, algorithms with better performance are still needed. Low-rank subspace clustering has become a popular approach in recent years. In this paper, we propose a novel one-step robust low-rank subspace segmentation method (ORLRS) for clustering tumor samples. For a gene expression data set, we seek its lowest-rank representation matrix and the noise matrix. By imposing a discrete constraint on the low-rank matrix, ORLRS learns the cluster indicators of the subspaces directly, without performing spectral clustering; that is, it performs the clustering task in one step. To improve the robustness of the method, a capped norm is adopted to remove extreme outliers in the noise matrix. Furthermore, we develop an efficient algorithm to solve the ORLRS optimization problem. Experiments on several tumor gene expression data sets demonstrate the effectiveness of ORLRS.
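To illustrate the role of the capped norm mentioned above, the following is a minimal sketch rather than the paper's implementation. It assumes the commonly used capped l2,1 form, in which each column of the noise matrix contributes min(||e_i||_2, eps) to the penalty, and it assumes samples are stored as columns. The function names `capped_l21_norm` and `outlier_mask` and the threshold `eps` are hypothetical and chosen only for this example.

```python
import numpy as np

def capped_l21_norm(E, eps):
    """Sum over columns of min(||e_i||_2, eps).

    Assumption: samples are columns of E, and the cap eps bounds the
    contribution of any single sample to the penalty.
    """
    col_norms = np.linalg.norm(E, axis=0)
    return float(np.minimum(col_norms, eps).sum())

def outlier_mask(E, eps):
    """Flag columns whose residual norm exceeds the cap (treated as extreme outliers)."""
    return np.linalg.norm(E, axis=0) > eps

# Toy noise matrix: column norms are 5, 1, and 50.
E = np.array([[3.0, 0.0, 30.0],
              [4.0, 1.0, 40.0]])

value = capped_l21_norm(E, eps=10.0)  # 5 + 1 + 10 = 16
mask = outlier_mask(E, eps=10.0)      # only the third column exceeds the cap
```

Because the third column is capped at `eps`, making its residual even larger would not change the penalty, which is why a capped norm limits the influence of extreme outliers compared with an uncapped l2,1 norm.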