IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2466-2479. doi: 10.1109/TNNLS.2021.3106702. Epub 2023 May 2.
Unsupervised dimension reduction and clustering are frequently used as two separate steps to perform clustering in a subspace. However, such two-step methods may not reflect the cluster structure in the subspace. In addition, existing subspace clustering methods do not consider the relationship between the low-dimensional representation and the local structure in the input space. To address these issues, we propose a robust discriminant subspace (RDS) clustering model with adaptive local structure embedding. Specifically, unlike existing methods, which combine dimension reduction and clustering via a regularizer and thereby introduce extra parameters, RDS first integrates them into a unified matrix factorization (MF) model through theoretical proof. Furthermore, a similarity graph is constructed to learn the local structure. A constraint is imposed on the graph to guarantee that it has the same connected components as the low-dimensional representation. In this spirit, the similarity graph serves as a tradeoff that adaptively balances the learning process between the low-dimensional space and the original space. Finally, RDS adopts the ℓ2,1-norm to measure the residual error, which enhances the robustness to noise. Using the property of the ℓ2,1-norm, RDS can be optimized efficiently without introducing more penalty terms. Experimental results on real-world benchmark datasets show that RDS can provide more interpretable clustering results and also outperform other state-of-the-art alternatives.
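Two ingredients mentioned in the abstract can be illustrated with a minimal sketch. It is not the RDS optimization itself; it only shows (a) why a norm of the ℓ2,1 type penalizes an outlier sample less severely than a squared Frobenius residual, and (b) how the number of connected components of a similarity graph can be read from its Laplacian spectrum. The function names `l21_norm` and `num_connected_components` are hypothetical, the norm symbol is assumed to be ℓ2,1, and the convention that samples are stored as columns is also an assumption, since the abstract does not fix it.

```python
import numpy as np

def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E.

    Assumption: each column of E is the residual of one sample.
    """
    return float(np.sum(np.linalg.norm(E, axis=0)))

def num_connected_components(S, tol=1e-8):
    """Count connected components of a nonnegative similarity graph S.

    Uses the standard fact that the multiplicity of the zero eigenvalue
    of the graph Laplacian L = D - S equals the number of components.
    """
    S = (S + S.T) / 2.0                 # symmetrize
    L = np.diag(S.sum(axis=1)) - S      # unnormalized Laplacian
    eigvals = np.linalg.eigvalsh(L)     # real eigenvalues, ascending
    return int(np.sum(eigvals < tol))

# Residual matrix with one outlier sample (third column).
E = np.array([[3.0, 0.0, 30.0],
              [4.0, 1.0, 40.0]])
print(l21_norm(E))       # 5 + 1 + 50 = 56.0 (outlier enters linearly)
print(np.sum(E ** 2))    # 25 + 1 + 2500 = 2526.0 (outlier enters quadratically)

# Similarity graph made of two disconnected pairs of nodes.
S = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(num_connected_components(S))  # 2
```

In this toy residual the outlier accounts for about 89% of the ℓ2,1 value but about 99% of the squared Frobenius value, which is the usual intuition behind the robustness claim.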