Interpreting Neural Networks for Biological Sequences by Learning Stochastic Masks.

Authors

Linder Johannes, La Fleur Alyssa, Chen Zibo, Ljubetič Ajasja, Baker David, Kannan Sreeram, Seelig Georg

Affiliations

Paul G. Allen School of Computer Science and Engineering, University of Washington.

Institute for Protein Design, University of Washington.

Publication information

Nat Mach Intell. 2022 Jan;4(1):41-54. doi: 10.1038/s42256-021-00428-6. Epub 2022 Jan 25.

DOI:10.1038/s42256-021-00428-6
PMID:35966405
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9373874/
Abstract

Sequence-based neural networks can learn to make accurate predictions from large biological datasets, but model interpretation remains challenging. Many existing feature attribution methods are optimized for continuous rather than discrete input patterns and assess individual feature importance in isolation, making them ill-suited for interpreting non-linear interactions in molecular sequences. Building on work in computer vision and natural language processing, we developed an approach based on deep learning - Scrambler networks - wherein the most salient sequence positions are identified with learned input masks. Scramblers learn to predict Position-Specific Scoring Matrices (PSSMs) where unimportant nucleotides or residues are scrambled by raising their entropy. We apply Scramblers to interpret the effects of genetic variants, uncover non-linear interactions between cis-regulatory elements, explain binding specificity for protein-protein interactions, and identify structural determinants of designed proteins. We show that Scramblers enable efficient attribution across large datasets and result in high-quality explanations, often outperforming state-of-the-art methods.
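
The idea of "scrambling by raising entropy" can be illustrated with a small sketch. It is not the authors' implementation: the linear mixing rule, the uniform background, the hand-written mask values, and the function names `scramble` and `entropy_bits` are all assumptions made for illustration. In the method described above, a neural network learns the mask rather than a person supplying it. Each position of a one-hot DNA sequence is blended with a background distribution; a mask value of 1 keeps the original nucleotide (zero entropy) and a mask value of 0 replaces it with the background (maximum entropy).

```python
import math

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of one-hot rows over ALPHABET."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def scramble(seq, mask, background=None):
    """Blend each one-hot position with a background distribution.

    mask[i] in [0, 1]: 1 keeps position i intact, 0 fully scrambles it.
    This linear blend is an illustrative assumption, not the paper's
    parameterization.
    """
    if len(seq) != len(mask):
        raise ValueError("seq and mask must have the same length")
    if background is None:
        # Assumed uniform background over the four nucleotides.
        background = [1.0 / len(ALPHABET)] * len(ALPHABET)
    pssm = []
    for row, m in zip(one_hot(seq), mask):
        pssm.append([m * p + (1.0 - m) * b for p, b in zip(row, background)])
    return pssm


def entropy_bits(row):
    """Shannon entropy of one PSSM row, in bits."""
    return -sum(p * math.log2(p) for p in row if p > 0.0)


# Hypothetical mask: keep positions 0 and 3, scramble 1, half-scramble 2.
pssm = scramble("ACGT", [1.0, 0.0, 0.5, 1.0])
entropies = [entropy_bits(row) for row in pssm]

for base, h in zip("ACGT", entropies):
    print(f"{base}: {h:.3f} bits")
```

Positions with low entropy in the resulting PSSM are the ones the mask marks as important; high-entropy positions carry little information about the original sequence.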
