Explainable AI Model Reveals Informative Mutational Signatures for Cancer-Type Classification.

Authors

Wagner Jonas, Oldenburg Jan, Nath Neetika, Simm Stefan

Affiliations

Institute of Bioinformatics, University Medicine Greifswald, 17475 Greifswald, Germany.

Institute of Bioanalysis, Department of Applied Sciences, Coburg University of Applied Sciences and Arts, 96450 Coburg, Germany.

Publication information

Cancers (Basel). 2025 May 22;17(11):1731. doi: 10.3390/cancers17111731.

DOI: 10.3390/cancers17111731

PMID: 40507213

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12153866/

Abstract

Background/Objectives: The prediction of cancer types relies primarily on driver genes and their specific mutations. Advances in omics technologies have made additional genetic data available. Integrated with artificial intelligence models, these data have considerable potential to improve the accuracy of cancer diagnosis. Because mutational signatures can provide insights into malfunctioning repair mechanisms, they may also support more accurate cancer diagnosis.

Methods: First, we compared unsupervised and supervised machine learning approaches for predicting cancer types. We employed deep and artificial neural network architectures with an explainable component, such as layerwise relevance propagation, to extract the features most relevant to cancer-type prediction. Ten-fold cross-validation and an extensive grid search were used to optimize the neural network architecture, with driver gene mutations, mutational signatures and topological mutation information as input. The PCAWG dataset was used as input to discriminate between 17 primary sites and 24 cancer types.

Results: Overall, our approach showed that the most relevant mutation information for discriminating between cancer types increased by >10% when using the whole genome, or intergenic and intronic genome regions, instead of exome information. Furthermore, for all but two cancer types, the most relevant features are in the mutational signatures and not in the topological mutation information.

Conclusions: Informative mutational signatures outperformed driver gene mutations in predicting cancer types and added a new layer of diagnostic information. Because the degree of information within the mutational signatures is not based solely on frequency of occurrence, it is even possible to separate cancer types from the same primary site by their different relevant mutations. Furthermore, comparing informative mutational signatures allowed specific impaired repair mechanisms to be assigned to cancer types.
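The abstract names layerwise relevance propagation (LRP) as the explainable component but does not describe how it was implemented. The sketch below is only a minimal illustration of the idea, using the commonly described epsilon rule on a tiny bias-free ReLU network in plain Python. The weights, input values and feature names (`signature_A`, `signature_B`, `driver_gene_X`) are hypothetical placeholders and are not taken from the paper or from PCAWG.

```python
# Minimal LRP (epsilon rule) sketch for a tiny dense ReLU network without biases.
# Every number and feature name here is an illustrative assumption.

def dense(x, W):
    """Pre-activations z_j = sum_i x_i * W[i][j]."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def relu(z):
    return [max(0.0, v) for v in z]

def lrp_epsilon(a, W, R_out, eps=1e-9):
    """Redistribute the relevance R_out of a layer's outputs onto its inputs a."""
    z = dense(a, W)
    R_in = [0.0] * len(a)
    for j, zj in enumerate(z):
        # Small signed stabilizer so the division never hits exactly zero.
        denom = zj + (eps if zj >= 0 else -eps)
        for i in range(len(a)):
            R_in[i] += a[i] * W[i][j] / denom * R_out[j]
    return R_in

# Hypothetical input features for one sample.
feature_names = ["signature_A", "signature_B", "driver_gene_X"]
x = [2.0, 0.0, 1.0]

# Hypothetical weights: 3 inputs -> 2 hidden units -> 2 output classes.
W1 = [[0.5, -1.0], [1.0, 0.5], [-0.5, 1.0]]
W2 = [[1.0, -1.0], [0.5, 2.0]]

# Forward pass.
h = relu(dense(x, W1))
logits = dense(h, W2)
target = logits.index(max(logits))  # explain the predicted class

# Backward relevance pass: start with the target logit, zero elsewhere.
R_logits = [v if k == target else 0.0 for k, v in enumerate(logits)]
R_hidden = lrp_epsilon(h, W2, R_logits)
relevance = lrp_epsilon(x, W1, R_hidden)

for name, r in zip(feature_names, relevance):
    print(f"{name}: {r:+.3f}")
```

In this bias-free setting the input relevances sum, up to the stabilizer, to the logit being explained, which is why per-feature relevance can be compared across inputs. A real analysis would more likely rely on an established LRP implementation than on hand-written loops.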

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3546/12153866/073337b436dd/cancers-17-01731-g001.jpg
