Bai Liang, Liang Jiye, Zhao Yunxiao
IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):5126-5138. doi: 10.1109/TPAMI.2022.3188160. Epub 2023 Mar 7.
As a leading graph clustering technique, spectral clustering is one of the most widely used methods for capturing complex cluster structures in data. Additional prior information can further reduce the gap between its clustering results and users' expectations. However, such prior information is hard to obtain in an unsupervised setting, so it is rarely available to guide the clustering process. To solve this problem, we propose a self-constrained spectral clustering algorithm. In this algorithm, we extend the objective function of spectral clustering by adding pairwise and label self-constrained terms. We provide a theoretical analysis that shows the roles of the self-constrained terms and the extensibility of the proposed algorithm. Based on the new objective function, we build an optimization model for self-constrained spectral clustering that learns the clustering results and the constraints simultaneously. Furthermore, we propose an iterative method to solve the new optimization problem. Compared with existing variants of spectral clustering, the new algorithm can discover a high-quality cluster structure in a data set without prior information. Extensive experiments on benchmark data sets illustrate the effectiveness of the proposed algorithm.
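The abstract does not state the objective function or the update rules, so the following is only a rough sketch of the general idea of alternating between clustering and learning constraints, not the authors' method. It assumes a Gaussian affinity, the standard normalized spectral embedding, and a stand-in pairwise term: a co-membership matrix built from the current labels and added to the affinity with a hypothetical weight `alpha`. All function names and parameters (`rbf_affinity`, `spectral_embedding`, `kmeans`, `self_constrained_sketch`, `gamma`, `alpha`, `n_rounds`) are illustrative assumptions.

```python
import numpy as np

def rbf_affinity(X, gamma=1.0):
    """Gaussian affinity matrix with a zero diagonal (an assumed choice)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * sq)
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embedding(W, k):
    """Row-normalized top-k eigenvectors of D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    S = (S + S.T) / 2.0                      # enforce exact symmetry
    _, vecs = np.linalg.eigh(S)              # eigenvalues in ascending order
    U = vecs[:, -k:]                         # largest k eigenvalues
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.maximum(norms, 1e-12)

def kmeans(U, k, n_iter=50):
    """Small deterministic k-means with farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    labels = np.zeros(len(U), dtype=int)
    for _ in range(n_iter):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def co_membership(labels, k):
    """Pairwise matrix: 1 where two distinct points share a label."""
    Y = np.eye(k)[labels]
    Q = Y @ Y.T
    np.fill_diagonal(Q, 0.0)
    return Q

def self_constrained_sketch(X, k, gamma=1.0, alpha=0.5, n_rounds=5):
    """Alternate between clustering and re-deriving pairwise constraints.

    Hypothetical stand-in for a self-constrained objective: the pairwise
    term is simply added to the affinity, which reinforces the current
    partition. The published model may differ substantially.
    """
    W = rbf_affinity(X, gamma)
    labels = kmeans(spectral_embedding(W, k), k)
    Q = co_membership(labels, k)
    for _ in range(n_rounds):
        new_labels = kmeans(spectral_embedding(W + alpha * Q, k), k)
        new_Q = co_membership(new_labels, k)
        labels = new_labels
        if np.array_equal(new_Q, Q):         # partition stopped changing
            break
        Q = new_Q
    return labels

# Usage on two well-separated synthetic blobs.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 0.1, (10, 2)),
                    rng.normal(3.0, 0.1, (10, 2))])
labels_demo = self_constrained_sketch(X_demo, k=2)
```

Because the learned pairwise matrix is derived from the current labels, this simplified loop mostly stabilizes whatever partition the first spectral step finds; a faithful implementation would need the objective and solver from the paper itself.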