Deciphering RNA splicing logic with interpretable machine learning.

Affiliation

Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012.

Publication information

Proc Natl Acad Sci U S A. 2023 Oct 10;120(41):e2221165120. doi: 10.1073/pnas.2221165120. Epub 2023 Oct 5.

DOI: 10.1073/pnas.2221165120
PMID: 37796983
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10576025/

Abstract

Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design. However, current state-of-the-art neural networks are limited by their uninterpretability: Despite their excellent accuracy, they cannot describe how they arrived at their predictions. Here, using an "interpretable-by-design" approach, we present a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information into functional biochemical products. Although we designed our model to emphasize interpretability, its predictive accuracy is on par with state-of-the-art models. To demonstrate the model's interpretability, we introduce a visualization that, for any given exon, allows us to trace and quantify the entire decision process from input sequence to output splicing prediction. Importantly, the model revealed uncharacterized components of the splicing logic, which we experimentally validated. This study highlights how interpretable machine learning can advance scientific discovery.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0899/10576025/f6b2cc484319/pnas.2221165120fig01.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0899/10576025/054dad1b1c97/pnas.2221165120fig02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0899/10576025/283c79625260/pnas.2221165120fig03.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0899/10576025/1c8b20b161dc/pnas.2221165120fig04.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0899/10576025/c929cd6f84dc/pnas.2221165120fig05.jpg
