Explainable Multilayer Graph Neural Network for cancer gene prediction.

Affiliations

LIX, École Polytechnique, IP Paris, Rte de Saclay, Palaiseau, 91120, France.

Division of Artificial Intelligence in Medicine, Cedars-Sinai Medical Center, 116 N. Robertson Boulevard, Los Angeles, CA 90048, United States.

Publication information

Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad643.

DOI:10.1093/bioinformatics/btad643

PMID:37862225

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10636280/

Abstract

MOTIVATION

The identification of cancer genes is a critical yet challenging problem in cancer genomics research. Existing computational methods, including deep graph neural networks, either fail to exploit multilayered gene-gene interactions or provide only limited explanations for their predictions. These methods are restricted to a single biological network, which cannot capture the full complexity of tumorigenesis. Models trained on different biological networks often yield different, and even opposite, cancer gene predictions, hindering their trustworthy adoption. Here, we introduce an Explainable Multilayer Graph Neural Network (EMGNN) approach to identify cancer genes by leveraging multiple gene-gene interaction networks and pan-cancer multi-omics data. Unlike conventional graph learning on a single biological network, EMGNN uses a multilayered graph neural network to learn from multiple biological networks for accurate cancer gene prediction.
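As a rough illustration of what learning from several gene-gene networks at once can mean, the toy sketch below propagates gene features over each network separately and then averages the per-network results for every gene. It is not the EMGNN architecture: there are no learned weights, and the function names `propagate` and `combine_layers`, the example networks, and the feature values are all hypothetical.

```python
def propagate(features, adjacency):
    """One round of mean aggregation over each gene and its neighbours."""
    out = {}
    for gene, vec in features.items():
        # A gene always contributes to its own update (self-loop).
        group = [gene] + [n for n in adjacency.get(gene, ()) if n in features]
        out[gene] = [
            sum(features[g][i] for g in group) / len(group)
            for i in range(len(vec))
        ]
    return out


def combine_layers(features, networks):
    """Average each gene's per-network representation across all networks."""
    if not networks:
        raise ValueError("at least one network is required")
    per_layer = [propagate(features, adj) for adj in networks]
    return {
        gene: [
            sum(layer[gene][i] for layer in per_layer) / len(per_layer)
            for i in range(len(vec))
        ]
        for gene, vec in features.items()
    }


# Hypothetical toy data: three genes, one feature, two interaction networks.
features = {"A": [1.0], "B": [0.0], "C": [0.0]}
ppi = {"A": ["B"], "B": ["A"], "C": []}           # assumed network 1
coexpression = {"A": ["C"], "C": ["A"], "B": []}  # assumed network 2

combined = combine_layers(features, [ppi, coexpression])
```

In this toy case gene B is connected to A in only one network and C in only the other, so a single-network view would disagree about them, while the combined view gives both the same intermediate value.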

RESULTS

Our method consistently outperforms all existing methods, with an average 7.15% improvement in area under the precision-recall curve over the current state-of-the-art method. Importantly, EMGNN integrated multiple graphs to prioritize newly predicted cancer genes with conflicting predictions from single biological networks. For each prediction, EMGNN provided valuable biological insights via both model-level feature importance explanations and molecular-level gene set enrichment analysis. Overall, EMGNN offers a powerful new paradigm of graph learning through modeling the multilayered topological gene relationships and provides a valuable tool for cancer genomics research.
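The metric named above, area under the precision-recall curve, is often estimated by average precision. The sketch below is a minimal pure-Python version of that estimator; the abstract does not state which estimator or tie-handling rule the authors used, so the function `average_precision` and the example labels and scores are illustrative assumptions only.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision values at the rank of each positive.

    labels: iterable of 0/1 ground-truth values (1 = known cancer gene).
    scores: iterable of predicted scores, higher meaning more likely positive.
    Tied scores are ranked in input order; other tools may treat ties differently.
    """
    labels = list(labels)
    scores = list(scores)
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank genes from highest to lowest predicted score (stable sort).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Hypothetical example: positives ranked 1st and 3rd out of four genes.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Precision-recall based metrics are commonly preferred over ROC-based ones when positives are rare, which is typically the case for known cancer genes among all genes.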

AVAILABILITY AND IMPLEMENTATION

Our code is publicly available at https://github.com/zhanglab-aim/EMGNN.
