Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin.

Authors

Ritambhara Singh, Jack Lanchantin, Arshdeep Sekhon, Yanjun Qi

Affiliation

Department of Computer Science, University of Virginia.

Publication

Adv Neural Inf Process Syst. 2017 Dec;30:6785-6795.

PMID: 30147283

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6105294/

Abstract

The past decade has seen a revolution in genomic technologies that enabled a flood of genome-wide profiling of chromatin marks. Recent literature tried to understand gene regulation by predicting gene expression from large-scale chromatin measurements. Two fundamental challenges exist for such learning tasks: (1) genome-wide chromatin signals are spatially structured, high-dimensional and highly modular; and (2) the core aim is to understand what the relevant factors are and how they work together. Previous studies either failed to model complex dependencies among input signals or relied on separate feature analysis to explain the decisions. This paper presents an attention-based deep learning approach, AttentiveChrome, that uses a unified architecture to model and to interpret dependencies among chromatin factors for controlling gene regulation. AttentiveChrome uses a hierarchy of multiple Long Short-Term Memory (LSTM) modules to encode the input signals and to model how various chromatin marks cooperate automatically. AttentiveChrome trains two levels of attention jointly with the target prediction, enabling it to attend differentially to relevant marks and to locate important positions per mark. We evaluate the model across 56 different cell types (tasks) in humans. Not only is the proposed architecture more accurate, but its attention scores provide a better interpretation than state-of-the-art feature visualization methods such as saliency maps.
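The two levels of attention described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the LSTM encoders are omitted, the bin embeddings are toy numbers, and the context vectors (`bin_context`, `mark_context`) are hypothetical fixed inputs rather than parameters learned jointly with the prediction. The function names and the mark names used as dictionary keys are illustrative assumptions. The sketch only shows how soft attention first weights positions within each mark and then weights the marks themselves.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors, context):
    """Soft attention: dot-product score against a context vector,
    softmax over the scores, then a weighted sum of the vectors."""
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    summary = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return summary, weights


def two_level_attention(marks, bin_context, mark_context):
    """marks maps a mark name to a list of per-position embeddings.
    Level 1 attends over positions within each mark; level 2 attends
    over the resulting per-mark summaries."""
    names = list(marks)
    mark_summaries = []
    bin_weights = {}
    for name in names:
        summary, weights = attend(marks[name], bin_context)
        mark_summaries.append(summary)
        bin_weights[name] = weights
    gene_vector, mark_w = attend(mark_summaries, mark_context)
    return gene_vector, dict(zip(names, mark_w)), bin_weights


# Toy input (assumed values): two marks, three positions each, 2-d embeddings.
marks = {
    "H3K4me3": [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]],
    "H3K27me3": [[0.0, 0.3], [0.1, 0.2], [0.0, 0.1]],
}
gene_vec, mark_w, bin_w = two_level_attention(marks, [1.0, 0.5], [1.0, 0.0])
print("mark weights:", mark_w)
print("position weights:", bin_w)
```

In a trained model, `gene_vec` would feed a classifier for gene expression, while `mark_w` and `bin_w` are the quantities one would inspect for interpretation: which marks matter and which positions within each mark.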


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00b8/6105294/85e3e85b7aa9/nihms935717f1.jpg

Similar articles

DeepDiff: DEEP-learning for predicting DIFFerential gene expression from histone modifications.

Bioinformatics. 2018 Sep 1;34(17):i891-i900. doi: 10.1093/bioinformatics/bty612.

Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks.

Sensors (Basel). 2017 Jan 30;17(2):273. doi: 10.3390/s17020273.

Engineering Aspects of Olfaction.

Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation Networks.

Sensors (Basel). 2017 Jun 26;17(7):1501. doi: 10.3390/s17071501.

Predicting chromatin organization using histone marks.

Genome Biol. 2015 Aug 14;16(1):162. doi: 10.1186/s13059-015-0740-z.

De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture.

Proc Natl Acad Sci U S A. 2017 Nov 14;114(46):12126-12131. doi: 10.1073/pnas.1714980114. Epub 2017 Oct 31.

Assessment and statistical modeling of the relationship between remotely sensed aerosol optical depth and PM2.5 in the eastern United States.

Res Rep Health Eff Inst. 2012 May(167):5-83; discussion 85-91.

Top-down attention based on object representation and incremental memory for knowledge building and inference.

Neural Netw. 2013 Oct;46:9-22. doi: 10.1016/j.neunet.2013.04.002. Epub 2013 Apr 8.

'Traffic light rules': Chromatin states direct miRNA-mediated network motifs running by integrating epigenome and regulatome.

Biochim Biophys Acta. 2016 Jul;1860(7):1475-88. doi: 10.1016/j.bbagen.2016.04.008. Epub 2016 Apr 14.

Cited by

Multimodal integration strategies for clinical application in oncology.

Front Pharmacol. 2025 Aug 20;16:1609079. doi: 10.3389/fphar.2025.1609079. eCollection 2025.

Towards universal modeling of transcript isoform expression levels.

bioRxiv. 2025 Jul 25:2025.07.21.665977. doi: 10.1101/2025.07.21.665977.

Machine learning on multiple epigenetic features reveals H3K27Ac as a driver of gene expression prediction across patients with glioblastoma.

PLoS Comput Biol. 2025 Aug 7;21(8):e1012272. doi: 10.1371/journal.pcbi.1012272. eCollection 2025 Aug.

Osteoarthritis progression pattern based on patient specific characteristics using machine learning.

NPJ Digit Med. 2025 Jul 21;8(1):464. doi: 10.1038/s41746-025-01878-7.

Machine Learning-Based Prediction Model for Predicting the Effect of the Serum γKlotho Level on Susceptibility to Coronary Heart Disease.

Vasc Health Risk Manag. 2025 May 27;21:425-436. doi: 10.2147/VHRM.S508351. eCollection 2025.

Federated transfer learning with differential privacy for multi-omics survival analysis.

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf166.

Interpretable multimodal deep learning model for predicting post-surgical international society of urological pathology grade in primary prostate cancer.

Eur J Nucl Med Mol Imaging. 2025 Apr 4. doi: 10.1007/s00259-025-07248-5.

Dynamic Gene Attention Focus (DyGAF): Enhancing Biomarker Identification Through Dual-Model Attention Networks.

Bioinform Biol Insights. 2025 Mar 27;19:11779322251325390. doi: 10.1177/11779322251325390. eCollection 2025.

A novel integrative multimodal classifier to enhance the diagnosis of Parkinson's disease.

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf088.

Elevated few-shot network intrusion detection via self-attention mechanisms and iterative refinement.

PLoS One. 2025 Jan 16;20(1):e0317713. doi: 10.1371/journal.pone.0317713. eCollection 2025.

References

DEEP MOTIF DASHBOARD: VISUALIZING AND UNDERSTANDING GENOMIC SEQUENCES USING DEEP NEURAL NETWORKS.

Pac Symp Biocomput. 2017;22:254-265. doi: 10.1142/9789813207813_0025.

DeepChrome: deep-learning for predicting gene expression from histone modifications.

Bioinformatics. 2016 Sep 1;32(17):i639-i648. doi: 10.1093/bioinformatics/btw427.

Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks.

Genome Res. 2016 Jul;26(7):990-9. doi: 10.1101/gr.200535.115. Epub 2016 May 3.

DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences.

Nucleic Acids Res. 2016 Jun 20;44(11):e107. doi: 10.1093/nar/gkw226. Epub 2016 Apr 15.

Predicting effects of noncoding variants with deep learning-based sequence model.

Nat Methods. 2015 Oct;12(10):931-4. doi: 10.1038/nmeth.3547. Epub 2015 Aug 24.

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.

Nat Biotechnol. 2015 Aug;33(8):831-8. doi: 10.1038/nbt.3300. Epub 2015 Jul 27.

Integrative analysis of 111 reference human epigenomes.

Nature. 2015 Feb 19;518(7539):317-30. doi: 10.1038/nature14248.

Polycomb repressive complex 2 and H3K27me3 cooperate with H3K9 methylation to maintain heterochromatin protein 1α at chromatin.

Mol Cell Biol. 2014 Oct 1;34(19):3662-74. doi: 10.1128/MCB.00205-14. Epub 2014 Jul 21.

The correlation between histone modifications and gene expression.

Epigenomics. 2013 Apr;5(2):113-6. doi: 10.2217/epi.13.13.

Modeling gene expression using chromatin features in various cellular contexts.

Genome Biol. 2012 Jun 13;13(9):R53. doi: 10.1186/gb-2012-13-9-r53.
