State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, Guizhou, PR China.
Neural Netw. 2022 Nov;155:144-154. doi: 10.1016/j.neunet.2022.08.006. Epub 2022 Aug 17.
Structural deep clustering uses neural networks to fuse semantic and structural representations for clustering tasks, and it has received increasing attention. Some pioneering works integrated an auto-encoder (AE)-specific representation with a graph convolutional network (GCN)-specific representation by delivering semantic information to the GCN module layer by layer. Although these methods perform well in various applications, we observe that they overlook a vital aspect: structural information may vanish during learning because of the over-smoothing problem of the GCN module, which yields non-representative features and degrades clustering performance. In this study, we address this issue by proposing a structure enhanced deep clustering network, in which the GCN-specific structural data representation is enhanced and supervised by its structural information. Specifically, this representation is strengthened during learning by combining it with a structure enhanced semantic (SES) representation. A novel structure enhanced AE, named the weighted neighbourhood AE (wNAE), learns the SES representation for each data sample. Finally, we design a joint supervision strategy that uniformly guides the simultaneous learning of the wNAE module, the GCN module, and the clustering assignment. Experimental results on different datasets empirically validate the importance of semantic and neighbour-wise structure learning.
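The over-smoothing concern raised in the abstract can be illustrated with a minimal sketch. This is not the network proposed in the paper: it assumes a toy six-node graph, a row-normalised propagation matrix D^-1(A + I), and no learned weights or nonlinearities, and the function names `row_normalised_adjacency` and `feature_spread` are hypothetical. Under those assumptions, repeated propagation pulls every node's features towards the same vector, so the structural differences between nodes fade.

```python
import numpy as np

def row_normalised_adjacency(adj):
    """Return D^-1 (A + I): each node averages itself and its neighbours."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def feature_spread(h):
    """Largest per-feature gap between any two nodes (0 means all rows are equal)."""
    return float((h.max(axis=0) - h.min(axis=0)).max())

# Toy graph (an assumption for illustration): two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

prop = row_normalised_adjacency(adj)

# Random node features stand in for learned representations.
rng = np.random.default_rng(0)
h = rng.normal(size=(6, 4))

# Apply the propagation step repeatedly, as a deep stack of linear GCN layers would.
spreads = [feature_spread(h)]
for _ in range(100):
    h = prop @ h
    spreads.append(feature_spread(h))

print(f"spread before propagation: {spreads[0]:.3f}")
print(f"spread after 100 steps:    {spreads[-1]:.2e}")
```

The spread never increases from one step to the next, because each step replaces a node's features with an average over its neighbourhood. This collapse is the effect that the abstract describes the structure enhanced semantic representation as counteracting.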