Inherently interpretable position-aware convolutional motif kernel networks for biological sequencing data.

Affiliation

Methods in Medical Informatics, Department of Computer Science, University of Tübingen, Sand 14, Tübingen, 72076, Germany.

Publication information

Sci Rep. 2023 Oct 11;13(1):17216. doi: 10.1038/s41598-023-44175-7.

Abstract

Artificial neural networks show promising performance in detecting correlations within data that are associated with specific outcomes. However, the black-box nature of such models can hinder the advancement of knowledge in research fields by obscuring the decision process and preventing scientists from fully conceptualizing predicted outcomes. Furthermore, domain experts such as healthcare providers need explainable predictions to assess whether a predicted outcome can be trusted in high-stakes scenarios and to help them integrate a model into their own routine. Interpretable models therefore play a crucial role in the incorporation of machine learning into high-stakes scenarios like healthcare. In this paper we introduce Convolutional Motif Kernel Networks, a neural network architecture that learns a feature representation within a subspace of the reproducing kernel Hilbert space of the position-aware motif kernel function. The resulting model allows prediction outcomes to be interpreted and evaluated directly, providing a biologically and medically meaningful explanation without the need for additional post-hoc analysis. We show that our model learns robustly on small datasets and reaches state-of-the-art performance on relevant healthcare prediction tasks. The proposed method can be applied to DNA and protein sequences. Furthermore, we show that it learns biologically meaningful concepts directly from data using an end-to-end learning scheme.
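
The abstract does not define the position-aware motif kernel, so the following is only a toy illustration of the general idea of comparing sequences by which motifs they share and how close together those motifs occur. It is a minimal sketch under stated assumptions, not the kernel or network from the paper: the function names, the use of exact k-mer matching, and the Gaussian position weighting with width `sigma` are all hypothetical choices made for illustration.

```python
import math


def kmers(seq, k):
    """Return (position, k-mer) pairs for every window of length k in seq."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


def position_aware_kernel(x, y, k=3, sigma=1.0):
    """Toy position-aware similarity between two sequences (illustrative only).

    Every pair of identical k-mers contributes a Gaussian weight that decays
    with the squared distance between their positions. Exact matching and the
    Gaussian weighting are assumptions, not the paper's definition.
    """
    total = 0.0
    for i, a in kmers(x, k):
        for j, b in kmers(y, k):
            if a == b:
                total += math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))
    return total


# The same motif at the same position scores higher than the same motif shifted.
aligned = position_aware_kernel("ACGTT", "ACGAA")
shifted = position_aware_kernel("ACGTT", "TTACG")
print(aligned, shifted)
```

In this sketch a larger `sigma` tolerates larger positional shifts, while a small `sigma` rewards only motifs that occur at nearly the same position in both sequences.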

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ccd2/10567796/16bf2e783f3e/41598_2023_44175_Fig1_HTML.jpg