Survey on graph embeddings and their applications to machine learning problems on graphs.

Authors

Makarov Ilya, Kiselev Dmitrii, Nikitinsky Nikita, Subelj Lovro

Affiliations

HSE University, Moscow, Russia.

Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.

Publication information

PeerJ Comput Sci. 2021 Feb 4;7:e357. doi: 10.7717/peerj-cs.357. eCollection 2021.

DOI:10.7717/peerj-cs.357

PMID:33817007

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7959646/

Abstract

Dealing with relational data has always required significant computational resources, domain expertise, and task-dependent feature engineering to incorporate structural information into a predictive model. More recently, a family of automated graph feature engineering techniques has been proposed in different streams of literature. So-called graph embeddings provide a powerful tool for constructing vectorized feature spaces for graphs and their components, such as nodes, edges, and subgraphs, while preserving inner graph properties. Using the constructed feature spaces, many machine learning problems on graphs can be solved via standard frameworks suitable for vectorized feature representations. Our survey aims to describe the core concepts of graph embeddings and provide several taxonomies for their description. First, we start with the methodological approach and distinguish three types of graph embedding models, based on matrix factorization, random walks, and deep learning. Next, we describe how different types of networks affect the ability of models to incorporate structural and attribute data into a unified embedding. Going further, we perform a thorough evaluation of graph embedding applications to machine learning problems on graphs, including node classification, link prediction, clustering, visualization, compression, and a family of whole-graph embedding algorithms suitable for graph classification, similarity, and alignment problems. Finally, we overview the existing applications of graph embeddings to computer science domains, formulate open problems, and provide experimental results explaining how different network properties affect the quality of graph embeddings in four classic machine learning problems on graphs: node classification, link prediction, clustering, and graph visualization. As a result, our survey covers a new, rapidly growing field of network feature engineering, presents an in-depth analysis of models based on network types, and overviews a wide range of applications to machine learning problems on graphs.
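
As a rough illustration of the random-walk family named in the abstract, the sketch below covers only the sampling step: it draws uniform random walks over a small graph, producing node sequences that DeepWalk-style methods would then pass to a skip-gram model to learn embeddings. The toy graph, the function name `random_walks`, and its parameters are illustrative assumptions, not taken from the survey.

```python
import random


def random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Sample uniform random walks starting from every node.

    adjacency: dict mapping each node to a list of its neighbours.
    Returns a list of walks, each walk being a list of nodes.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

walks = random_walks(toy_graph)
```

Each walk plays the role of a "sentence" of node identifiers. Nodes that co-occur often within walks, such as those inside the same triangle here, would tend to receive similar vectors once an embedding model is trained on these sequences.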
