WalkLM: A Uniform Language Model Fine-tuning Framework for Attributed Graph Embedding.

Authors

Tan Yanchao, Zhou Zihao, Lv Hang, Liu Weiming, Yang Carl

Affiliations

College of Computer and Data Science, Fuzhou University, Fuzhou, China.

College of Computer Science, Zhejiang University, Hangzhou, China.

Publication information

Adv Neural Inf Process Syst. 2023 Dec;36:13308-13325. Epub 2024 May 30.

Abstract

Graphs are widely used to model interconnected entities and improve downstream predictions in various real-world applications. However, real-world graphs often carry complex attributes on multiple types of nodes, and even on links, that are hard to model uniformly, while the widely used graph neural networks (GNNs) often require sufficient training toward specific downstream predictions to achieve strong performance. In this work, we take a fundamentally different approach from GNNs to simultaneously achieve deep joint modeling of the complex attributes and flexible structures of real-world graphs and to obtain unsupervised generic graph representations that are not limited to specific downstream predictions. Our framework, built on a natural integration of language models (LMs) and random walks (RWs), is straightforward, powerful and data-efficient. Specifically, we first perform attributed RWs on the graph and design an automated program to compose roughly meaningful textual sequences directly from the attributed RWs; we then fine-tune an LM on the RW-based textual sequences and extract embedding vectors from the LM, which encapsulate both attribute semantics and graph structures. In our experiments, we evaluate the learned node embeddings on different downstream prediction tasks across multiple real-world attributed graph datasets and observe significant improvements over a comprehensive set of state-of-the-art unsupervised node embedding methods. We believe this work opens a door to more sophisticated technical designs and empirical evaluations for leveraging LMs to model real-world graphs.
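
The first stage described in the abstract (attributed random walks composed into text) can be illustrated with a minimal sketch. The toy graph, the relation phrases, and the function names `attributed_random_walk`, `textualize`, and `build_corpus` are all hypothetical and chosen only for illustration; they are not taken from the paper. The LM fine-tuning and embedding-extraction stages are not shown, since the abstract does not specify them.

```python
import random

# Hypothetical toy attributed graph: node id -> (node type, attribute text).
NODES = {
    "a1": ("author", "Alice"),
    "p1": ("paper", "Graph Embedding Survey"),
    "p2": ("paper", "Language Models for Graphs"),
    "v1": ("venue", "NeurIPS"),
}

# Typed, directed links: source -> list of (relation phrase, target).
EDGES = {
    "a1": [("writes", "p1"), ("writes", "p2")],
    "p1": [("is written by", "a1"), ("is published in", "v1"), ("cites", "p2")],
    "p2": [("is written by", "a1"), ("is published in", "v1")],
    "v1": [("publishes", "p1"), ("publishes", "p2")],
}


def describe(node):
    """Render a node's type and attribute as a short text fragment."""
    node_type, text = NODES[node]
    return f"{node_type} {text}"


def attributed_random_walk(start, length, rng):
    """Return [node, relation, node, ...] after at most `length` uniform steps."""
    walk = [start]
    current = start
    for _ in range(length):
        neighbors = EDGES.get(current, [])
        if not neighbors:  # dead end: stop the walk early
            break
        relation, current = rng.choice(neighbors)
        walk.extend([relation, current])
    return walk


def textualize(walk):
    """Compose a roughly meaningful sentence from an attributed walk."""
    parts = [describe(walk[0])]
    for i in range(1, len(walk), 2):
        parts.append(walk[i])                # relation phrase for the link
        parts.append(describe(walk[i + 1]))  # attribute text of the next node
    return " ".join(parts) + "."


def build_corpus(walks_per_node, length, seed=0):
    """Generate textual sequences that an LM could later be fine-tuned on."""
    rng = random.Random(seed)
    return [
        textualize(attributed_random_walk(node, length, rng))
        for node in NODES
        for _ in range(walks_per_node)
    ]


if __name__ == "__main__":
    for sentence in build_corpus(walks_per_node=1, length=3):
        print(sentence)
```

Each generated sentence interleaves node attributes with link relations, so a language model trained on such a corpus would see both attribute text and local graph structure in the same sequence. Uniform neighbor sampling is used here only for simplicity; other walk strategies would fit the same interface.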



