Xie Tian, Wang Bin, Kuo C-C Jay
IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):9287-9301. doi: 10.1109/TNNLS.2022.3157746. Epub 2023 Oct 27.
This work proposes GraphHop, a scalable semisupervised node classification method for graph-structured data. The graph contains the attributes and link connections of all nodes, but labels for only a subset of them. For this problem, graph convolutional networks (GCNs) have outperformed traditional label propagation (LP) methods in node label classification. Nevertheless, current GCN algorithms either require a considerable number of labels for training because of their high model complexity, or cannot be easily generalized to large-scale graphs because of the expense of loading the entire graph and node embeddings. In addition, their nonlinearity makes the optimization process difficult to interpret. To address these problems, GraphHop is proposed as an enhanced LP method. It can be viewed as a smoothening LP algorithm in which each propagation alternates between two steps: label aggregation and label update. In the label aggregation step, multihop neighbor embeddings are aggregated to the center node. In the label update step, new embeddings are learned and predicted for each node based on the aggregated results from the previous step. The two-step iteration improves the graph signal smoothening capacity. Furthermore, to encode attributes, links, and labels on graphs effectively under one framework, we adopt a two-stage training process consisting of an initialization stage and an iteration stage. Thus, the smooth attribute information extracted in the initialization stage is consistently imposed on the propagation process in the iteration stage. Experimental results show that GraphHop outperforms state-of-the-art graph learning methods on a wide range of tasks in graphs of various sizes (e.g., multilabel and multiclass classification on citation networks, social graphs, and commodity consumption graphs).
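The alternation between label aggregation and label update described above can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. All function names and parameters (`hop_neighbors`, `aggregate`, `update`, `propagate`, `alpha`, `max_hops`) are hypothetical. The learned predictor of the label update step is replaced here by a fixed convex mix of each node's own label vector and its hop averages. Node attributes and the initialization stage are omitted, so unlabeled nodes simply start from a uniform label distribution.

```python
def hop_neighbors(adj, node, hops):
    """Return the set of nodes whose shortest-path distance from `node` is `hops`."""
    seen = {node}
    frontier = {node}
    for _ in range(hops):
        frontier = {v for u in frontier for v in adj[u] if v not in seen}
        seen |= frontier
    return frontier


def mean_vector(vectors, dim):
    """Average equal-length vectors; return a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def aggregate(adj, emb, max_hops=2):
    """Label aggregation: for each node, one averaged label vector per hop distance."""
    dim = len(emb[0])
    return [
        [mean_vector([emb[v] for v in hop_neighbors(adj, u, h)], dim)
         for h in range(1, max_hops + 1)]
        for u in range(len(adj))
    ]


def update(emb, agg, labels, alpha=0.5):
    """Label update for every node.

    ASSUMPTION: a fixed mix (weight `alpha` on the node's own vector) stands in
    for the learned predictor mentioned in the abstract.
    """
    dim = len(emb[0])
    new_emb = []
    for u, own in enumerate(emb):
        if u in labels:
            # Labeled nodes are clamped to their known one-hot label.
            new_emb.append([1.0 if k == labels[u] else 0.0 for k in range(dim)])
            continue
        hop_mean = mean_vector(agg[u], dim)
        mixed = [alpha * own[k] + (1 - alpha) * hop_mean[k] for k in range(dim)]
        total = sum(mixed)
        # Renormalize so each label vector remains a probability distribution.
        new_emb.append([m / total for m in mixed] if total > 0 else list(own))
    return new_emb


def propagate(adj, labels, num_classes, iterations=10, max_hops=2, alpha=0.5):
    """Alternate aggregation and update, then return the predicted class per node."""
    uniform = [1.0 / num_classes] * num_classes
    emb = [
        [1.0 if k == labels[u] else 0.0 for k in range(num_classes)]
        if u in labels else list(uniform)
        for u in range(len(adj))
    ]
    for _ in range(iterations):
        agg = aggregate(adj, emb, max_hops)
        emb = update(emb, agg, labels, alpha)
    return [max(range(num_classes), key=lambda k: vec[k]) for vec in emb]


if __name__ == "__main__":
    # Six-node path graph 0-1-2-3-4-5 with only the two end nodes labeled.
    path = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
    print(propagate(path, {0: 0, 5: 1}, num_classes=2))
```

In this toy run on a six-node path, the three nodes nearest each labeled end are assigned that end's class. A version closer to the abstract would train a classifier on the aggregated vectors in the update step and would initialize the label vectors from node attributes.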