
Prototypical Graph Contrastive Learning.

Authors

Lin Shuai, Liu Chen, Zhou Pan, Hu Zi-Yuan, Wang Shuojia, Zhao Ruihui, Zheng Yefeng, Lin Liang, Xing Eric, Liang Xiaodan

Publication Information

IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2747-2758. doi: 10.1109/TNNLS.2022.3191086. Epub 2024 Feb 5.

Abstract

Graph-level representations are critical in various real-world applications, such as predicting the properties of molecules. However, in practice, precise graph annotations are generally very expensive and time-consuming. To address this issue, graph contrastive learning constructs an instance discrimination task, which pulls together positive pairs (augmentation pairs of the same graph) and pushes away negative pairs (augmentation pairs of different graphs) for unsupervised representation learning. However, since for a query, its negatives are uniformly sampled from all graphs, existing methods suffer from the critical sampling bias issue, i.e., the negatives likely having the same semantic structure with the query, leading to performance degradation. To mitigate this sampling bias issue, in this article, we propose a prototypical graph contrastive learning (PGCL) approach. Specifically, PGCL models the underlying semantic structure of the graph data via clustering semantically similar graphs into the same group and simultaneously encourages the clustering consistency for different augmentations of the same graph. Then, given a query, it performs negative sampling via drawing the graphs from those clusters that differ from the cluster of query, which ensures the semantic difference between query and its negative samples. Moreover, for a query, PGCL further reweights its negative samples based on the distance between their prototypes (cluster centroids) and the query prototype such that those negatives having moderate prototype distance enjoy relatively large weights. This reweighting strategy is proven to be more effective than uniform sampling. Experimental results on various graph benchmarks testify the advantages of our PGCL over state-of-the-art methods. The code is publicly available at https://github.com/ha-lins/PGCL.
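The abstract's reweighting idea can be illustrated with a minimal numeric sketch: negatives are drawn from clusters other than the query's, and each negative's contribution to an InfoNCE-style loss is scaled by the distance between its cluster prototype and the query's prototype. The weighting formula, similarity function, and all names below are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def negative_weights(query_proto, neg_protos):
    # Weight each negative by the distance between its cluster prototype
    # and the query's prototype, normalized to sum to 1 (an illustrative
    # weighting, not PGCL's exact formula).
    d = np.linalg.norm(neg_protos - query_proto, axis=1)
    return d / d.sum()

def weighted_info_nce(query, positive, negatives, weights, tau=0.5):
    # InfoNCE-style loss with reweighted negatives, using cosine
    # similarity (an assumed choice for this sketch).
    def cos(a, B):
        B = np.atleast_2d(B)
        return (B @ a) / (np.linalg.norm(a) * np.linalg.norm(B, axis=1))
    s_pos = np.exp(cos(query, positive)[0] / tau)
    s_neg = np.exp(cos(query, negatives) / tau)
    # Scale by the number of negatives so uniform weights (1/N each)
    # recover the plain, unweighted InfoNCE denominator.
    denom = s_pos + (len(negatives) * weights * s_neg).sum()
    return -np.log(s_pos / denom)

# Toy usage: three synthetic cluster prototypes in R^8.
rng = np.random.default_rng(0)
protos = rng.normal(size=(3, 8))
query = protos[0] + 0.1 * rng.normal(size=8)     # query near cluster 0
positive = protos[0] + 0.1 * rng.normal(size=8)  # augmentation of same graph
# Negatives sampled only from clusters 1 and 2 (not the query's cluster).
negatives = np.vstack([protos[1] + 0.1 * rng.normal(size=(4, 8)),
                       protos[2] + 0.1 * rng.normal(size=(4, 8))])
neg_protos = np.repeat(protos[1:], 4, axis=0)  # each negative's prototype
w = negative_weights(protos[0], neg_protos)
loss = weighted_info_nce(query, positive, negatives, w)
```

With uniform weights this reduces to standard InfoNCE over the sampled negatives; the distance-based weights instead emphasize negatives whose clusters are farther from the query's, while the cluster-restricted sampling itself removes the semantically identical negatives that cause the sampling bias described above.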

