Predicting mRNA Abundance Directly from Genomic Sequence Using Deep Convolutional Neural Networks.

Affiliations

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Calico Life Sciences LLC, South San Francisco, CA 94080, USA.

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, USA.

Publication information

Cell Rep. 2020 May 19;31(7):107663. doi: 10.1016/j.celrep.2020.107663.

DOI: 10.1016/j.celrep.2020.107663
PMID: 32433972

Abstract

Algorithms that accurately predict gene structure from primary sequence alone were transformative for annotating the human genome. Can we also predict the expression levels of genes based solely on genome sequence? Here, we sought to apply deep convolutional neural networks toward that goal. Surprisingly, a model that includes only promoter sequences and features associated with mRNA stability explains 59% and 71% of variation in steady-state mRNA levels in human and mouse, respectively. This model, termed Xpresso, more than doubles the accuracy of alternative sequence-based models and isolates rules as predictive as models relying on chromatin immunoprecipitation sequencing (ChIP-seq) data. Xpresso recapitulates genome-wide patterns of transcriptional activity, and its residuals can be used to quantify the influence of enhancers, heterochromatic domains, and microRNAs. Model interpretation reveals that promoter-proximal CpG dinucleotides strongly predict transcriptional activity. Looking forward, we propose cell-type-specific gene-expression predictions based solely on primary sequences as a grand challenge for the field.
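
To make the quantities in the abstract concrete, the following is a minimal Python sketch and not the Xpresso implementation. It shows three generic building blocks such a pipeline could plausibly use: one-hot encoding a promoter sequence as input for a convolutional network, counting CpG dinucleotides, and computing the fraction of variance explained. The function names (`one_hot`, `cpg_count`, `variance_explained`) and the toy sequence are hypothetical, and the R-squared shown is one common definition of "variance explained" that may differ from the statistic the authors report.

```python
from typing import List, Sequence

BASES = "ACGT"  # assumed channel order; any fixed order works


def one_hot(seq: str) -> List[List[float]]:
    """Encode DNA as a length x 4 matrix; unknown bases (e.g. N) become all-zero rows."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in seq.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        if base in index:
            row[index[base]] = 1.0
        encoded.append(row)
    return encoded


def cpg_count(seq: str) -> int:
    """Count CpG dinucleotides, i.e. a C immediately followed by a G."""
    return seq.upper().count("CG")


def variance_explained(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


# Toy promoter fragment, invented purely for illustration.
promoter = "ACGCGTNACG"
matrix = one_hot(promoter)
n_cpg = cpg_count(promoter)
```

A value of 0.59 from `variance_explained` would correspond to the "59% of variation" phrasing used for human in the abstract, under the assumption that this definition matches the authors' metric.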

Similar articles

1. Predicting mRNA Abundance Directly from Genomic Sequence Using Deep Convolutional Neural Networks.
   Cell Rep. 2020 May 19;31(7):107663. doi: 10.1016/j.celrep.2020.107663.
2. MiREx: mRNA levels prediction from gene sequence and miRNA target knowledge.
   BMC Bioinformatics. 2023 Nov 22;24(1):443. doi: 10.1186/s12859-023-05560-1.
3. Predicting gene and protein expression levels from DNA and protein sequences with Perceiver.
   Comput Methods Programs Biomed. 2023 Jun;234:107504. doi: 10.1016/j.cmpb.2023.107504. Epub 2023 Mar 22.
4. Predicting enhancers with deep convolutional neural networks.
   BMC Bioinformatics. 2017 Dec 1;18(Suppl 13):478. doi: 10.1186/s12859-017-1878-3.
5. Predicting gene regulatory regions with a convolutional neural network for processing double-strand genome sequence information.
   PLoS One. 2020 Jul 23;15(7):e0235748. doi: 10.1371/journal.pone.0235748. eCollection 2020.
6. Integrating distal and proximal information to predict gene expression via a densely connected convolutional neural network.
   Bioinformatics. 2020 Jan 15;36(2):496-503. doi: 10.1093/bioinformatics/btz562.
7. MRCNN: a deep learning model for regression of genome-wide DNA methylation.
   BMC Genomics. 2019 Apr 4;20(Suppl 2):192. doi: 10.1186/s12864-019-5488-5.
8. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.
   BMC Bioinformatics. 2018 May 31;19(1):202. doi: 10.1186/s12859-018-2187-1.
9. Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions.
   BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):54. doi: 10.1186/s12918-016-0302-3.
10. Deep learning of the tissue-regulated splicing code.
    Bioinformatics. 2014 Jun 15;30(12):i121-9. doi: 10.1093/bioinformatics/btu277.

Citing articles

1. Pretraining Improves Prediction of Genomic Datasets Across Species.
   bioRxiv. 2025 Aug 24:2025.08.20.671362. doi: 10.1101/2025.08.20.671362.
2. RNADecayCafe, a uniformly processed atlas of RNA half-life estimates across multiple human cell lines.
   bioRxiv. 2025 Aug 21:2025.08.19.671151. doi: 10.1101/2025.08.19.671151.
3. In silico prediction of variant effects: promises and limitations for precision plant breeding.
   Theor Appl Genet. 2025 Jul 28;138(8):193. doi: 10.1007/s00122-025-04973-1.
4. Predicting the translation efficiency of messenger RNA in mammalian cells.
   Nat Biotechnol. 2025 Jul 25. doi: 10.1038/s41587-025-02712-x.
5. UTRGAN: learning to generate 5' UTR sequences for optimized translation efficiency and gene expression.
   Bioinform Adv. 2025 Jun 10;5(1):vbaf134. doi: 10.1093/bioadv/vbaf134. eCollection 2025.
6. Predicting gene expression from DNA sequence using deep learning models.
   Nat Rev Genet. 2025 May 13. doi: 10.1038/s41576-025-00841-2.
7. Rewriting regulatory DNA to dissect and reprogram gene expression.
   Cell. 2025 Apr 14. doi: 10.1016/j.cell.2025.03.034.
8. Precise engineering of gene expression by editing plasticity.
   Genome Biol. 2025 Mar 10;26(1):51. doi: 10.1186/s13059-025-03516-7.
9. Genetic Studies Through the Lens of Gene Networks.
   Annu Rev Biomed Data Sci. 2025 Feb 20. doi: 10.1146/annurev-biodatasci-103123-095355.
10. mRNA-LM: full-length integrated SLM for mRNA analysis.
    Nucleic Acids Res. 2025 Jan 24;53(3). doi: 10.1093/nar/gkaf044.