University of Chinese Academy of Sciences (UCAS), Beijing, China.
University of Chinese Academy of Sciences (UCAS), Beijing, China; Key Laboratory of Big Data Mining and Knowledge Management, CAS, Beijing, China.
Neural Netw. 2023 May;162:412-424. doi: 10.1016/j.neunet.2023.03.015. Epub 2023 Mar 11.
With the development of graph neural networks, how to handle large-scale graph data has become an increasingly important topic. Currently, most graph neural network models that scale to large graphs rely on random sampling methods. However, the sampling process in these models is detached from the forward propagation of the neural network. Moreover, many works design sampling schemes for graph convolutional networks based on statistical estimation, in which the message-passing weights at each GCN node are fixed; these sampling methods therefore do not extend to message-passing networks with variable weights, such as graph attention networks. Exploiting the end-to-end learning capability of neural networks, we propose a learnable sampling method. It solves the problem that random sampling operations cannot propagate gradients, and it samples nodes with non-fixed probabilities. In this way, the sampling process is dynamically coupled with the forward propagation of the features, allowing the network to be trained better, and the method generalizes to all message-passing models. In addition, we apply the learnable sampling method to GNNs and propose two models. Our method can be flexibly combined with different graph neural network models and achieves excellent accuracy on benchmark datasets with large graphs. Meanwhile, during training the loss function converges to smaller values at a faster rate than with previous methods.
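The abstract does not spell out how the sampling operation is made differentiable. A common way to let gradients flow through a discrete sampling step, and only an illustration here rather than the paper's actual method, is the Gumbel-Softmax relaxation: learnable per-neighbor logits are perturbed with Gumbel noise and pushed through a temperature-controlled softmax, yielding soft, differentiable sampling probabilities over a node's neighborhood. The function and variable names below are hypothetical.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=0.5, rng=None):
    """Differentiable relaxation of categorical sampling (Gumbel-Softmax).

    logits : learnable scores, one per candidate neighbor (illustrative).
    tau    : temperature; smaller values make the output closer to one-hot.
    Returns a probability vector that can stand in for a hard sample while
    still admitting gradients with respect to `logits`.
    """
    rng = rng or np.random.default_rng(0)
    # Gumbel(0, 1) noise via the inverse-CDF trick: -log(-log(U))
    g = -np.log(-np.log(rng.uniform(1e-10, 1.0, size=logits.shape)))
    y = (logits + g) / tau
    y = y - y.max()          # shift for numerical stability
    e = np.exp(y)
    return e / e.sum()

# Toy example: a node with four candidate neighbors and learnable logits.
logits = np.array([2.0, 0.5, -1.0, 0.1])
probs = gumbel_softmax_sample(logits)
```

In a GNN layer, such soft probabilities could weight neighbor messages during the forward pass, coupling the sampling distribution to feature propagation so that it is updated end to end; the paper's two proposed models presumably build a comparable learnable operator into standard message-passing architectures.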