Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA.
Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA.
Sci Rep. 2023 May 12;13(1):7742. doi: 10.1038/s41598-023-34115-w.
The Brain and Muscle ARNTL-Like 1 protein (BMAL1) forms a heterodimer with either Circadian Locomotor Output Cycles Kaput (CLOCK) or Neuronal PAS domain protein 2 (NPAS2) to act as a master regulator of the mammalian circadian clock gene network. The dimer binds to E-box gene regulatory elements on DNA, activating downstream transcription of clock genes. Identifying the transcription factor binding sites and genomic features that correlate with DNA binding by BMAL1 is challenging because CLOCK-BMAL1 and NPAS2-BMAL1 bind to several distinct variants of the E-box motif (CANNTG). Using three types of tissue-specific machine learning models, with features based on (1) DNA sequence, (2) DNA sequence plus DNA shape, and (3) DNA sequence and shape plus histone modifications, we developed an interpretable predictive model of genome-wide BMAL1 binding to E-box motifs and dissected the mechanisms underlying BMAL1-DNA binding. Our results indicated that histone modifications, the local shape of the DNA, and the flanking sequence of the E-box motif are sufficient predictive features for BMAL1-DNA binding. Our models also provide mechanistic insights into the tissue specificity of DNA binding by BMAL1.
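As a rough illustration of the sequence-only feature set (1), the sketch below scans a DNA string for E-box motifs matching CANNTG and one-hot encodes the motif's flanking bases. It is not the authors' pipeline: the function names `find_eboxes` and `one_hot`, the 3 bp flank width, and the encoding scheme are assumptions chosen for the example, and the DNA shape and histone modification features are not covered.

```python
import re

# CANNTG with N restricted to A/C/G/T. The lookahead lets overlapping
# motifs be reported. The reverse complement of CANNTG also matches
# CANNTG, so scanning one strand is enough to locate every site.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq, flank=3):
    """Return E-box hits with `flank` bases on each side (hypothetical helper).

    Hits too close to either end of the sequence to have full flanks
    are skipped.
    """
    seq = seq.upper()
    hits = []
    for match in EBOX.finditer(seq):
        start = match.start()
        end = start + 6  # the E-box motif is 6 bp long
        if start - flank < 0 or end + flank > len(seq):
            continue
        hits.append({
            "start": start,
            "motif": match.group(1),
            "left": seq[start - flank:start],
            "right": seq[end:end + flank],
        })
    return hits


def one_hot(bases):
    """Encode a DNA string as a flat 0/1 list, four entries per base (A, C, G, T)."""
    order = "ACGT"
    return [1 if base == ref else 0 for base in bases for ref in order]


if __name__ == "__main__":
    for hit in find_eboxes("GGTCACGTGACC"):
        # Illustrative sequence features: left flank, central NN, right flank.
        features = one_hot(hit["left"] + hit["motif"][2:4] + hit["right"])
        print(hit["start"], hit["motif"], len(features))
```

A feature vector of this kind could be passed to any standard classifier; which model and window size work best would have to be determined on real binding data.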