Evaluating deep learning for predicting epigenomic profiles.

Author information

Toneyan Shushan, Tang Ziqi, Koo Peter K

Affiliation

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.

Publication information

Nat Mach Intell. 2022 Dec;4(12):1088-1100. doi: 10.1038/s42256-022-00570-9. Epub 2022 Dec 5.

Abstract

Deep learning has been successful at predicting epigenomic profiles from DNA sequences. Most approaches frame this task as a binary classification relying on peak callers to define functional activity. Recently, quantitative models have emerged to directly predict the experimental coverage values as a regression. As new models continue to emerge with different architectures and training configurations, a major bottleneck is forming due to the lack of ability to fairly assess the novelty of proposed models and their utility for downstream biological discovery. Here we introduce a unified evaluation framework and use it to compare various binary and quantitative models trained to predict chromatin accessibility data. We highlight various modeling choices that affect generalization performance, including a downstream application of predicting variant effects. In addition, we introduce a robustness metric that can be used to enhance model selection and improve variant effect predictions. Our empirical study largely supports that quantitative modeling of epigenomic profiles leads to better generalizability and interpretability.
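
The contrast the abstract draws between binary and quantitative framings can be illustrated with a small sketch. Everything below is an assumption made for illustration: the coverage numbers are made up, the fixed-threshold `binary_label` is only a stand-in for a real peak caller, Poisson negative log-likelihood is one common loss for count data rather than the loss the authors report, and the log-ratio `variant_effect` is one plausible way to score a variant, not the paper's robustness metric or scoring method.

```python
import math

# Hypothetical read coverage over one 8-bin region (made-up numbers).
coverage = [0, 1, 3, 12, 15, 4, 1, 0]

def binary_label(cov, threshold=10):
    """Binary framing: 1 if any bin reaches a fixed threshold.
    A crude stand-in for a peak caller, which is far more involved."""
    return int(max(cov) >= threshold)

def bce_loss(label, prob):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-7
    prob = min(max(prob, eps), 1 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def poisson_nll(cov, pred):
    """Quantitative framing: mean Poisson negative log-likelihood per bin,
    dropping the log(y!) term, which does not depend on the prediction."""
    eps = 1e-7
    return sum(p - y * math.log(p + eps) for y, p in zip(cov, pred)) / len(cov)

def variant_effect(pred_ref, pred_alt):
    """Illustrative variant score: log2 ratio of total predicted coverage
    for the alternate allele versus the reference allele."""
    return math.log2((sum(pred_alt) + 1) / (sum(pred_ref) + 1))

if __name__ == "__main__":
    print("binary label:", binary_label(coverage))
    flat = [sum(coverage) / len(coverage)] * len(coverage)
    print("Poisson NLL, flat prediction:", poisson_nll(coverage, flat))
    print("variant effect:", variant_effect([1, 1], [3, 3]))
```

In this sketch the binary framing collapses the whole profile into a single label, while the quantitative framing scores the predicted coverage bin by bin, so the shape and magnitude of the signal both contribute to the loss.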
