Learning the Regulatory Code of Gene Expression.

Authors

Zrimec Jan, Buric Filip, Kokina Mariia, Garcia Victor, Zelezniak Aleksej

Affiliations

Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden.

Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark.

Publication information

Front Mol Biosci. 2021 Jun 10;8:673363. doi: 10.3389/fmolb.2021.673363. eCollection 2021.

DOI:10.3389/fmolb.2021.673363
PMID:34179082
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8223075/
Abstract

Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.
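
As a rough illustration of what "learning sequence representations" operates on, the sketch below one-hot encodes a DNA string and slides a single filter along it, which is the basic operation of a convolutional layer in sequence models. It is not taken from the review: the function names, the example sequence, and the hand-set TATA-like filter are all hypothetical, and a real network would learn many such filters from data rather than have them specified by hand.

```python
# Illustrative sketch only: names, sequence and filter weights are assumptions,
# not material from the reviewed paper.
ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]

def scan(seq, motif_filter):
    """Slide a (width x 4) filter along the sequence, as a 1D convolution would."""
    x = one_hot(seq)
    width = len(motif_filter)
    scores = []
    for start in range(len(x) - width + 1):
        score = sum(
            x[start + i][j] * motif_filter[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores

# Hypothetical hand-set filter that rewards the pattern "TATA".
TATA_FILTER = [
    [0.0, 0.0, 0.0, 1.0],  # T
    [1.0, 0.0, 0.0, 0.0],  # A
    [0.0, 0.0, 0.0, 1.0],  # T
    [1.0, 0.0, 0.0, 0.0],  # A
]

scores = scan("GCGTATAAGC", TATA_FILTER)
# Index of the window with the highest filter response.
best = max(range(len(scores)), key=scores.__getitem__)
```

Here `best` points at the window where the filter matches most strongly; interpreting trained filters in this way is one route by which such models are related back to regulatory motifs.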

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/adac/8223075/722808f3020a/fmolb-08-673363-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/adac/8223075/745cd22e21cf/fmolb-08-673363-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/adac/8223075/fec3871b76f4/fmolb-08-673363-g002.jpg
