Wang Jingyu, Ma Zhenyu, Nie Feiping, Li Xuelong
IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):15012-15020. doi: 10.1109/TNNLS.2023.3279380. Epub 2024 Oct 7.
Spectral clustering (SC) has been applied to a wide variety of data structures over the past few decades owing to its breakthroughs in graph learning. However, the time-consuming eigenvalue decomposition (EVD) and the information loss incurred during relaxation and discretization reduce both efficiency and accuracy, especially on large-scale data. To address these issues, this brief proposes a simple and fast method named efficient discrete clustering with anchor graph (EDCAG), which avoids postprocessing by optimizing binary labels directly. First, sparse anchors are adopted to accelerate graph construction and to obtain a parameter-free anchor similarity matrix. Then, inspired by intraclass similarity maximization in SC, we design an intraclass similarity maximization model between the anchor layer and the sample layer to handle the anchor graph cut problem and to exploit more explicit data structures. Finally, a fast coordinate rising (CR) algorithm is employed to alternately optimize the discrete labels of samples and anchors in the designed model. Experimental results show that EDCAG is fast and achieves competitive clustering performance.
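The abstract does not give the formula behind the anchor similarity matrix, so the following is only a minimal sketch of one closed-form construction that is common in the anchor-graph literature, not necessarily the one used by EDCAG. Each sample is connected to its k nearest anchors with weights computed from squared Euclidean distances, so no kernel bandwidth has to be tuned (the neighbor count k still has to be chosen). The function names `sq_dist` and `anchor_graph`, the random choice of anchors, and the toy data are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def anchor_graph(samples, anchors, k):
    """Build an n x m sample-to-anchor similarity matrix (list of rows).

    Assumed rule (hypothetical for EDCAG): for a sample with squared
    distances d_1 <= ... <= d_{k+1} to its nearest anchors, the weight of
    the j-th nearest anchor (j <= k) is
        (d_{k+1} - d_j) / (k * d_{k+1} - sum_{i<=k} d_i),
    and every other anchor gets weight 0. Each row then sums to 1.
    """
    m = len(anchors)
    if not 1 <= k < m:
        raise ValueError("k must satisfy 1 <= k < number of anchors")
    Z = []
    for x in samples:
        d = [sq_dist(x, a) for a in anchors]
        order = sorted(range(m), key=d.__getitem__)  # anchor indices, nearest first
        nearest, d_next = order[:k], d[order[k]]     # k nearest and the (k+1)-th distance
        denom = k * d_next - sum(d[j] for j in nearest)
        row = [0.0] * m
        if denom > 1e-12:
            for j in nearest:
                row[j] = (d_next - d[j]) / denom
        else:
            # Degenerate case: the k+1 nearest anchors are equidistant,
            # so fall back to uniform weights over the k nearest.
            for j in nearest:
                row[j] = 1.0 / k
        Z.append(row)
    return Z


# Toy data: two Gaussian blobs in 2-D; anchors are a random subset of the
# samples (an assumption; the abstract does not say how anchors are chosen).
random.seed(0)
samples = [
    (random.gauss(cx, 0.3), random.gauss(cy, 0.3))
    for cx, cy in [(0.0, 0.0), (5.0, 5.0)]
    for _ in range(20)
]
anchors = random.sample(samples, 6)
Z = anchor_graph(samples, anchors, k=3)
```

Because each row has at most k nonzero entries, the matrix stays sparse and costs O(nm) distance evaluations to build for n samples and m anchors, which is the kind of saving over a full n x n similarity graph that the abstract refers to.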