Peng Zhihao, Liu Hui, Jia Yuheng, Hou Junhui
IEEE Trans Image Process. 2023;32:6457-6468. doi: 10.1109/TIP.2023.3333557. Epub 2023 Dec 1.
Existing graph clustering networks rely heavily on a predefined, fixed graph, and they can fail when that initial graph does not accurately capture the topological structure of the data in the embedding space. To address this issue, we propose the Embedding-Induced Graph Refinement Clustering Network (EGRC-Net), which uses the learned embedding to adaptively refine the initial graph and improve clustering performance. We first learn a latent feature representation that combines semantic and topological information, obtained from a vanilla auto-encoder and a graph convolution network, respectively. We then use the local geometric structure of the feature embedding space to construct an adjacency matrix, which is dynamically fused with the initial one through our proposed fusion architecture. To train the network in an unsupervised manner, we minimize the Jeffreys divergence between multiple derived distributions. In addition, we introduce an improved approximate personalized propagation of neural predictions to replace the standard graph convolution network, enabling EGRC-Net to scale effectively. Extensive experiments on nine widely used benchmark datasets show that the proposed methods consistently outperform several state-of-the-art approaches. Notably, EGRC-Net improves the Adjusted Rand Index (ARI) by more than 11.99% over the best baseline on the DBLP dataset. Furthermore, our scalable approach achieves a 10.73% gain in ARI while reducing memory usage by 33.73% and running time by 19.71%. The code for EGRC-Net will be made publicly available at https://github.com/ZhihaoPENG-CityU/EGRC-Net.
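The graph-refinement idea and the training objective named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the k-nearest-neighbour construction, and the fixed weight `alpha` are assumptions made for illustration only; the abstract states that the fusion is dynamic and does not say how the embedding-induced adjacency matrix is built. The Jeffreys divergence is shown in its standard form, the sum of the two directed Kullback-Leibler divergences.

```python
import numpy as np


def knn_adjacency(z, k):
    """Symmetric k-nearest-neighbour adjacency built from embedding rows.

    Hypothetical stand-in for the embedding-induced graph; the abstract
    only says that local geometric structure is used.
    """
    z = np.asarray(z, dtype=float)
    # Pairwise squared Euclidean distances between embedded samples.
    sq = np.sum(z ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (z @ z.T)
    np.fill_diagonal(d, np.inf)  # a sample is never its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]  # indices of the k closest samples
    a = np.zeros(d.shape)
    rows = np.repeat(np.arange(z.shape[0]), k)
    a[rows, idx.ravel()] = 1.0
    return np.maximum(a, a.T)  # keep an edge if either endpoint selected it


def fuse(a_init, a_embed, alpha=0.5):
    """Convex combination of the initial and embedding-induced graphs.

    alpha is a fixed, hypothetical weight; the paper's fusion architecture
    is described as dynamic and is not reproduced here.
    """
    return alpha * np.asarray(a_init) + (1.0 - alpha) * np.asarray(a_embed)


def jeffreys_divergence(p, q, eps=1e-12):
    """J(P, Q) = KL(P || Q) + KL(Q || P) = sum (p - q) * log(p / q)."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)  # avoid log(0)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    return float(np.sum((p - q) * (np.log(p) - np.log(q))))
```

Unlike a single Kullback-Leibler term, the Jeffreys divergence is symmetric in its two arguments, so neither distribution is treated as the fixed target.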