Inherently interpretable position-aware convolutional motif kernel networks for biological sequencing data.

Affiliations

Methods in Medical Informatics, Department of Computer Science, University of Tübingen, Sand 14, Tübingen, 72076, Germany.

Publication information

Sci Rep. 2023 Oct 11;13(1):17216. doi: 10.1038/s41598-023-44175-7.

DOI:10.1038/s41598-023-44175-7

PMID:37821530

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10567796/

Abstract

Artificial neural networks show promising performance in detecting correlations within data that are associated with specific outcomes. However, the black-box nature of such models can hinder the advancement of knowledge in research fields by obscuring the decision process and preventing scientists from fully conceptualizing predicted outcomes. Furthermore, domain experts such as healthcare providers need explainable predictions to assess whether a predicted outcome can be trusted in high-stakes scenarios and to help them integrate a model into their own routine. Therefore, interpretable models play a crucial role in the incorporation of machine learning into high-stakes scenarios such as healthcare. In this paper we introduce Convolutional Motif Kernel Networks, a neural network architecture that learns a feature representation within a subspace of the reproducing kernel Hilbert space of the position-aware motif kernel function. The resulting model allows prediction outcomes to be interpreted and evaluated directly by providing a biologically and medically meaningful explanation without the need for additional post-hoc analysis. We show that our model is able to learn robustly on small datasets and reaches state-of-the-art performance on relevant healthcare prediction tasks. Our proposed method can be applied to DNA and protein sequences. Furthermore, we show that the proposed method learns biologically meaningful concepts directly from data using an end-to-end learning scheme.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ccd2/10567796/16bf2e783f3e/41598_2023_44175_Fig1_HTML.jpg

Similar articles

COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data.

Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i76-i85. doi: 10.1093/bioinformatics/btad204.

Interpretable machine learning models for hospital readmission prediction: a two-step extracted regression tree approach.

BMC Med Inform Decis Mak. 2023 Jun 5;23(1):104. doi: 10.1186/s12911-023-02193-5.

Learning active subspaces and discovering important features with Gaussian radial basis functions neural networks.

Neural Netw. 2024 Aug;176:106335. doi: 10.1016/j.neunet.2024.106335. Epub 2024 Apr 29.

RNA-protein binding motifs mining with a new hybrid deep learning based cross-domain knowledge integration approach.

BMC Bioinformatics. 2017 Feb 28;18(1):136. doi: 10.1186/s12859-017-1561-8.

CEFEs: A CNN Explainable Framework for ECG Signals.

Artif Intell Med. 2021 May;115:102059. doi: 10.1016/j.artmed.2021.102059. Epub 2021 Mar 26.

Explainable Machine Learning Framework for Image Classification Problems: Case Study on Glioma Cancer Prediction.

J Imaging. 2020 May 28;6(6):37. doi: 10.3390/jimaging6060037.

Biological sequence modeling with convolutional kernel networks.

Bioinformatics. 2019 Sep 15;35(18):3294-3302. doi: 10.1093/bioinformatics/btz094.

Interpretability and Optimisation of Convolutional Neural Networks Based on Sinc-Convolution.

IEEE J Biomed Health Inform. 2023 Apr;27(4):1758-1769. doi: 10.1109/JBHI.2022.3185290. Epub 2023 Apr 4.

Development and validation of an interpretable 3 day intensive care unit readmission prediction model using explainable boosting machines.

Front Med (Lausanne). 2022 Aug 23;9:960296. doi: 10.3389/fmed.2022.960296. eCollection 2022.

Cited by

HIV multidrug class resistance prediction with a time sliding anchor approach.

Bioinform Adv. 2025 May 15;5(1):vbaf099. doi: 10.1093/bioadv/vbaf099. eCollection 2025.

References

Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.

Nat Mach Intell. 2019 May;1(5):206-215. doi: 10.1038/s42256-019-0048-x. Epub 2019 May 13.

Biological sequence modeling with convolutional kernel networks.

Bioinformatics. 2019 Sep 15;35(18):3294-3302. doi: 10.1093/bioinformatics/btz094.

SpliceRover: interpretable convolutional neural networks for improved splice site prediction.

Bioinformatics. 2018 Dec 15;34(24):4180-4188. doi: 10.1093/bioinformatics/bty497.

geno2pheno[ngs-freq]: a genotypic interpretation system for identifying viral drug resistance using next-generation sequencing data.

Nucleic Acids Res. 2018 Jul 2;46(W1):W271-W277. doi: 10.1093/nar/gky349.

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.

Nat Biotechnol. 2015 Aug;33(8):831-8. doi: 10.1038/nbt.3300. Epub 2015 Jul 27.

Short communication: Phenotypic protease inhibitor resistance and cross-resistance in the clinic from 2006 to 2008 and mutational prevalences in HIV from patients with discordant tipranavir and darunavir susceptibility phenotypes.

AIDS Res Hum Retroviruses. 2012 Sep;28(9):1019-24. doi: 10.1089/AID.2011.0242. Epub 2012 Mar 23.

HIV-1 protease mutations and protease inhibitor cross-resistance.

Antimicrob Agents Chemother. 2010 Oct;54(10):4253-61. doi: 10.1128/AAC.00574-10. Epub 2010 Jul 26.

The impact of individual human immunodeficiency virus type 1 protease mutations on drug susceptibility is highly influenced by complex interactions with the background protease sequence.

J Virol. 2009 Sep;83(18):9512-20. doi: 10.1128/JVI.00291-09. Epub 2009 Jul 8.

Rationale and uses of a public HIV drug-resistance database.

J Infect Dis. 2006 Sep 15;194 Suppl 1(Suppl 1):S51-8. doi: 10.1086/505356.

Prediction of splice sites with dependency graphs and their expanded bayesian networks.

Bioinformatics. 2005 Feb 15;21(4):471-82. doi: 10.1093/bioinformatics/bti025. Epub 2004 Sep 16.
