Rethinking the impact of noisy labels in graph classification: A utility and privacy perspective.

Authors

Li De, Li Xianxian, Gan Zeming, Li Qiyu, Qu Bin, Wang Jinyan

Affiliations

Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, 541004, China; School of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004, China.

School of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004, China.

Publication

Neural Netw. 2025 Feb;182:106919. doi: 10.1016/j.neunet.2024.106919. Epub 2024 Nov 20.

Abstract

Graph neural networks (GNNs) based on message-passing mechanisms have achieved strong results in graph classification tasks. However, their generalization performance degrades when noisy labels are present in the training data. Most existing noisy-label learning approaches focus on the visual domain or on graph node classification, and analyze the impact of noisy labels only from a utility perspective. In contrast, this paper measures the effects of noisy labels on graph classification from both data privacy and model utility perspectives. We find that noisy labels degrade the model's generalization performance and make membership inference attacks on graph data privacy more effective. To address this, we propose RGLC, a robust graph neural network approach for graph classification with noisy labels. Specifically, we first filter out noisy samples accurately using high-confidence samples and the first principal component vector of each class's features. Then, the robust principal component vectors and the model output under data augmentation are used to correct noisy labels, guided by dual spatial information. Finally, supervised graph contrastive learning is introduced to enhance the embedding quality of the model and protect the privacy of the training graph data. The utility and privacy of the proposed method are validated by comparing twelve different methods on eight real graph classification datasets. Compared with state-of-the-art methods, RGLC achieves performance gains of at least 0.8% and at most 7.8% at a 30% noisy-label rate, and reduces the accuracy of privacy attacks to below 60%.
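The filtering step described in the abstract can be illustrated with a rough sketch. The abstract does not give the exact construction, so everything below is an assumption made for illustration: the function names `first_principal_component` and `filter_noisy`, the use of an uncentered SVD on L2-normalized graph embeddings, and both thresholds are hypothetical and are not taken from the paper. The idea shown is only that the high-confidence samples of each class define a dominant direction, and samples whose embeddings align poorly with the direction of their labeled class are flagged as likely noisy.

```python
import numpy as np

def first_principal_component(X):
    """Return the first right singular vector of X (rows are samples).

    Assumption: uncentered SVD; the paper may define the vector differently.
    """
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[0]

def filter_noisy(embeddings, labels, probs, conf_threshold=0.9, sim_threshold=0.5):
    """Return a boolean mask that is True for samples kept as 'clean'.

    embeddings: (n, d) graph-level embeddings from the model
    labels:     (n,) integer labels, possibly noisy
    probs:      (n, num_classes) softmax outputs of the model
    Both thresholds are hypothetical values chosen for this sketch.
    """
    n = len(labels)
    clean = np.zeros(n, dtype=bool)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normed = embeddings / (norms + 1e-12)
    # Confidence the model assigns to each sample's given label.
    label_conf = probs[np.arange(n), labels]
    for c in np.unique(labels):
        in_class = labels == c
        confident = in_class & (label_conf >= conf_threshold)
        if confident.sum() < 2:
            continue  # too few reliable samples to estimate a direction
        u = first_principal_component(normed[confident])
        # The sign of a singular vector is arbitrary, so compare by absolute cosine.
        sim = np.abs(normed[in_class] @ u)
        clean[in_class] = sim >= sim_threshold
    return clean

# Toy data: the fourth sample is labeled 0 but lies along the class-1 direction.
emb = np.array([[1.0, 0.1], [1.0, -0.1], [0.9, 0.0], [0.05, 1.0],
                [0.0, 1.0], [0.1, 1.0], [-0.1, 1.0]])
lab = np.array([0, 0, 0, 0, 1, 1, 1])
prb = np.array([[0.95, 0.05]] * 3 + [[0.3, 0.7]] + [[0.05, 0.95]] * 3)
mask = filter_noisy(emb, lab, prb)
print(mask)
```

In this toy case only the fourth sample is rejected, because its embedding is nearly orthogonal to the direction estimated from the confident class-0 samples. A real pipeline would take the embeddings and probabilities from a trained GNN, and the correction and contrastive-learning stages mentioned in the abstract are not covered here.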
