Shi Gen, Zhu Yifan, Liu Jian K, Li Xuesong
IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):13914-13925. doi: 10.1109/TNNLS.2023.3273255. Epub 2024 Oct 7.
Representation learning on heterogeneous graphs with massive unlabeled data has attracted great interest. The heterogeneity of such graphs carries rich information but also makes it difficult to design unsupervised or self-supervised learning (SSL) strategies. Existing methods, such as random walk-based approaches, depend mainly on the proximity information of neighbors and cannot integrate node features into a higher-level representation. Furthermore, previous self-supervised or unsupervised frameworks are usually designed for node-level tasks; they often fail to capture global graph properties and may not perform well on graph-level tasks. A label-free framework that better captures the global properties of heterogeneous graphs is therefore needed. In this article, we propose HeGCL, a self-supervised heterogeneous graph neural network (GNN) based on cross-view contrastive learning. HeGCL encodes a heterogeneous graph through two views: the meta-path view and the outline view. Whereas the meta-path view provides semantic information, the outline view encodes the complex edge relations and captures graph-level properties by using a nonlocal block. HeGCL then learns node embeddings by maximizing the mutual information (MI) between the global representation from the outline view and the semantic representation from the meta-path view. Experiments on both node-level and graph-level tasks show that the proposed model outperforms other methods, and further exploratory studies show that the nonlocal block contributes significantly to graph-level tasks.
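The cross-view objective described in the abstract can be illustrated with a small sketch. The abstract does not state which MI estimator or which nonlocal block design HeGCL uses, so the code below rests on assumptions: it uses a symmetric InfoNCE loss (a common lower-bound surrogate for mutual information) and a parameter-free dot-product attention block with a residual connection. The names `nonlocal_block` and `cross_view_infonce` are hypothetical, and the toy embeddings stand in for real outline-view and meta-path-view encoders.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def nonlocal_block(h):
    """Assumed nonlocal block: every node attends to every other node.

    Uses scaled dot-product attention without learned weights, plus a
    residual connection, so each output row mixes in graph-wide context.
    """
    scores = h @ h.T / np.sqrt(h.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # row-wise softmax
    return h + attn @ h


def cross_view_infonce(z_outline, z_metapath, temperature=0.5):
    """Symmetric InfoNCE loss between two views of the same nodes.

    Row i of each view is the positive pair; all other rows act as
    negatives. Lower loss corresponds to a higher MI lower bound.
    """
    a = l2_normalize(z_outline)
    b = l2_normalize(z_metapath)
    logits = a @ b.T / temperature               # cosine similarity / tau

    def one_direction(lg):
        lg = lg - lg.max(axis=1, keepdims=True)
        log_prob = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))       # positives sit on the diagonal

    return 0.5 * (one_direction(logits) + one_direction(logits.T))


# Toy usage: 6 nodes with 8-dimensional embeddings (illustrative data only).
rng = np.random.default_rng(0)
h = rng.normal(size=(6, 8))
z_outline = nonlocal_block(h)                       # stand-in for the outline view
z_metapath = h + 0.1 * rng.normal(size=h.shape)     # stand-in for the meta-path view
print(cross_view_infonce(z_outline, z_metapath))
```

In a trainable model the two views would come from learned encoders and the loss would be minimized by gradient descent; the sketch only shows how matching rows across views are pulled together while mismatched rows are pushed apart.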