
Hierarchically Contrastive Hard Sample Mining for Graph Self-Supervised Pretraining.

Author Information

Tu Wenxuan, Zhou Sihang, Liu Xinwang, Ge Chunpeng, Cai Zhiping, Liu Yue

Publication Information

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):16748-16761. doi: 10.1109/TNNLS.2023.3297607. Epub 2024 Oct 29.

Abstract

Contrastive learning has recently emerged as a powerful technique for graph self-supervised pretraining (GSP). By maximizing the mutual information (MI) between a positive sample pair, the network is forced to extract discriminative information from graphs to generate high-quality sample representations. However, we observe that, during MI maximization (Infomax), existing contrastive GSP algorithms suffer from at least one of the following problems: 1) treating all samples equally during optimization and 2) falling into a single contrasting pattern within the graph. Consequently, the vast number of well-categorized samples overwhelms the representation learning process, and only limited information is accumulated, which degrades the learning capability of the network. To solve these issues, we propose in this article a novel GSP algorithm, hierarchically contrastive hard sample mining (HCHSM), which fuses information from different views and conducts hard sample mining in a hierarchically contrastive manner. The hierarchical property of this algorithm is manifested in two aspects. First, according to the results of multilevel MI estimation across different views, the MI-based hard sample selection (MHSS) module keeps filtering out easy nodes and drives the network to focus on hard nodes. Second, to collect more comprehensive information for hard sample learning, we introduce a hierarchically contrastive scheme that sequentially forces the learned node representations to incorporate multilevel intrinsic graph features. In this way, as the contrastive granularity becomes finer, complementary information from different levels is uniformly encoded, boosting the discrimination of hard samples and enhancing the quality of the learned graph embedding. Extensive experiments on seven benchmark datasets show that HCHSM outperforms competing methods on node classification and node clustering tasks. The source code of HCHSM is available at https://github.com/WxTu/HCHSM.
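
To make the hard-node filtering idea concrete, the PyTorch sketch below shows how a per-node MI proxy in a two-view contrastive setup can be used to drop easy nodes and concentrate an InfoNCE-style loss on hard ones. This is a minimal single-level illustration under stated assumptions, not the authors' MHSS module (see the repository above for the official implementation); the cosine-similarity MI proxy, the `hard_ratio` parameter, and the name `hard_node_infonce` are all hypothetical.

```python
import torch
import torch.nn.functional as F

def hard_node_infonce(z1, z2, hard_ratio=0.5, tau=0.5):
    """z1, z2: (N, d) node embeddings of the same graph under two augmented views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau              # (N, N) cross-view cosine similarities
    pos = sim.diag()                     # positive-pair scores: a crude MI proxy
    # Treat the nodes whose positive pair is least aligned as "hard";
    # well-aligned (easy) nodes are filtered out of the loss.
    n_hard = max(1, int(hard_ratio * z1.size(0)))
    hard_idx = torch.topk(-pos, n_hard).indices
    # InfoNCE restricted to hard nodes: negative log-softmax of the positive entry.
    loss = -pos[hard_idx] + torch.logsumexp(sim[hard_idx], dim=1)
    return loss.mean()

# Toy usage with random embeddings standing in for two-view GNN outputs.
z_a, z_b = torch.randn(8, 16), torch.randn(8, 16)
print(hard_node_infonce(z_a, z_b).item())
```

In HCHSM itself, the selection is driven by multilevel MI estimates across views and the contrast proceeds hierarchically toward finer granularity; this sketch collapses all of that to a single level purely to illustrate the easy-node filtering step.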

