Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004, China; Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin, 541004, China; School of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004, China.
School of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004, China.
Neural Netw. 2024 Apr;172:106113. doi: 10.1016/j.neunet.2024.106113. Epub 2024 Jan 6.
In the domain of graph-structured data learning, semi-supervised node classification is a critical task that relies mainly on information from unlabeled nodes together with a small fraction of labeled nodes for training. However, real-world graph-structured data often suffer from label noise, which significantly undermines the performance of Graph Neural Networks (GNNs). This problem becomes increasingly severe when labels are scarce. To tackle the issue of sparse and noisy labels, we propose a novel approach, the Contrastive Robust Graph Neural Network (CR-GNN). First, considering label sparsity and noise, we employ an unsupervised contrastive loss and further incorporate homophily in the graph structure, introducing a neighbor contrastive loss. Moreover, contrastive learning typically relies on data augmentation to construct positive and negative samples, which may lead to inconsistent predictions across the augmented views. Based on this, we propose a dynamic cross-entropy loss that selects nodes with consistent predictions as reliable nodes for the cross-entropy term, helping to mitigate overfitting to label noise. Finally, we propose cross-space consistency to narrow the semantic gap between the contrastive and classification spaces. Extensive experiments on multiple publicly available datasets demonstrate that CR-GNN notably outperforms existing methods in resisting label noise.
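The abstract describes two key components: a neighbor contrastive loss that exploits homophily by treating graph neighbors as additional positives, and a dynamic cross-entropy loss applied only to labeled nodes whose predictions agree across augmented views. The following is a minimal sketch of these two ideas under assumed conventions (two augmented views producing embeddings z1/z2 and logits, a dense adjacency matrix, and a temperature tau); it is not the authors' released implementation, and all names are illustrative.

```python
# Minimal sketch (not the authors' code) of a neighbor contrastive loss and
# consistent-prediction node selection, as described at a high level in the abstract.
import torch
import torch.nn.functional as F


def neighbor_contrastive_loss(z1, z2, adj, tau=0.5):
    """z1, z2: [N, d] node embeddings from two augmented views;
    adj: [N, N] binary adjacency matrix; tau: temperature."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = torch.exp(z1 @ z2.t() / tau)                     # cross-view similarities
    eye = torch.eye(adj.size(0), device=adj.device)
    pos_mask = (adj + eye).clamp(max=1.0)                  # self + neighbors as positives (homophily)
    pos = (sim * pos_mask).sum(dim=1)
    return -torch.log(pos / sim.sum(dim=1)).mean()


def reliable_node_mask(logits1, logits2, labeled_idx):
    """Keep labeled nodes whose predicted classes agree across the two views."""
    consistent = logits1.argmax(dim=1) == logits2.argmax(dim=1)
    labeled = torch.zeros_like(consistent)
    labeled[labeled_idx] = True
    return labeled & consistent


# Usage sketch: contrastive term plus cross-entropy restricted to reliable nodes.
# reliable = reliable_node_mask(logits1, logits2, labeled_idx)
# loss = neighbor_contrastive_loss(z1, z2, adj) + F.cross_entropy(logits1[reliable], y[reliable])
```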