Verifying explainability of a deep learning tissue classifier trained on RNA-seq data.

Affiliations

Max Kelsen, Brisbane, QLD, 4006, Australia.

QIMR Berghofer Medical Research Institute, Brisbane, QLD, 4006, Australia.

Publication information

Sci Rep. 2021 Jan 29;11(1):2641. doi: 10.1038/s41598-021-81773-9.

DOI:10.1038/s41598-021-81773-9
PMID:33514769
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7846764/
Abstract

For complex machine learning (ML) algorithms to gain widespread acceptance in decision making, we must be able to identify the features driving the predictions. Explainability models allow transparency of ML algorithms; however, their reliability within high-dimensional data is unclear. To test the reliability of the explainability model SHapley Additive exPlanations (SHAP), we developed a convolutional neural network to predict tissue classification from Genotype-Tissue Expression (GTEx) RNA-seq data representing 16,651 samples from 47 tissues. Our classifier achieved an average F1 score of 96.1% on held-out GTEx samples. Using SHAP values, we identified the 2423 most discriminatory genes, of which 98.6% were also identified by differential expression analysis across all tissues. The SHAP genes reflected expected biological processes involved in tissue differentiation and function. Moreover, SHAP genes clustered tissue types with superior performance when compared to all genes, genes detected by differential expression analysis, or random genes. We demonstrate the utility and reliability of SHAP to explain a deep learning model and highlight the strengths of applying ML to transcriptome data.
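
The two headline quantities in the abstract, an average F1 score across tissues and the share of SHAP-selected genes also found by differential expression, can be illustrated with a small sketch. This is not the authors' pipeline. It assumes that "average F1" means the unweighted (macro) mean of per-class F1 scores, and that genes are ranked by mean absolute attribution across samples, a common convention that the abstract does not confirm. The function names, gene names, and toy values are all hypothetical.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (an assumed reading of 'average F1')."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in set(y_true) | set(y_pred):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def top_genes_by_mean_abs(values, gene_names, k):
    """Rank genes by mean absolute attribution over samples; return the top k."""
    n = len(values)
    mean_abs = [
        sum(abs(row[j]) for row in values) / n for j in range(len(gene_names))
    ]
    order = sorted(range(len(gene_names)), key=lambda j: mean_abs[j], reverse=True)
    return [gene_names[j] for j in order[:k]]


def overlap_fraction(selected, reference):
    """Fraction of `selected` genes that also appear in `reference`."""
    selected = set(selected)
    if not selected:
        return 0.0
    return len(selected & set(reference)) / len(selected)


# Toy labels standing in for held-out tissue predictions (hypothetical).
y_true = ["liver", "liver", "lung", "lung", "heart"]
y_pred = ["liver", "lung", "lung", "lung", "heart"]
toy_f1 = macro_f1(y_true, y_pred)

# Toy attribution matrix: rows are samples, columns are genes (hypothetical).
attributions = [
    [0.9, -0.1, 0.0, 0.4],
    [-0.7, 0.2, 0.1, 0.5],
    [0.8, 0.0, -0.1, -0.6],
]
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
top2 = top_genes_by_mean_abs(attributions, genes, k=2)

# Hypothetical set of genes flagged by a differential expression analysis.
shared = overlap_fraction(top2, {"GENE_A", "GENE_C"})

print(round(toy_f1, 4), top2, shared)
```

In a real analysis the attribution matrix would come from an explainer applied to the trained network and the reference set from a differential expression tool. The sketch only shows how the summary numbers could be derived once those inputs exist.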

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0aa/7846764/262c62b11748/41598_2021_81773_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0aa/7846764/1b76e63c3320/41598_2021_81773_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0aa/7846764/a2c60fa97f98/41598_2021_81773_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d0aa/7846764/75cb26df1962/41598_2021_81773_Fig4_HTML.jpg
