Predicting gene expression from DNA sequence using deep learning models.

Author information

Barbadilla-Martínez Lucía, Klaassen Noud, van Steensel Bas, de Ridder Jeroen

Affiliations

Oncode Institute, Utrecht, The Netherlands.

Center for Molecular Medicine, UMC Utrecht, Utrecht, The Netherlands.

Publication information

Nat Rev Genet. 2025 May 13. doi: 10.1038/s41576-025-00841-2.

Transcription of genes is regulated by DNA elements such as promoters and enhancers, the activity of which is in turn controlled by many transcription factors. Owing to the highly complex combinatorial logic involved, it has been difficult to construct computational models that predict gene activity from DNA sequence. Recent advances in deep learning techniques applied to data from epigenome mapping and high-throughput reporter assays have made substantial progress towards addressing this complexity. Such models can capture the regulatory grammar with remarkable accuracy and show great promise in predicting the effects of non-coding variants, uncovering detailed molecular mechanisms of gene regulation and designing synthetic regulatory elements for biotechnology. Here, we discuss the principles of these approaches, the types of training data sets that are available and the strengths and limitations of different approaches.
