Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA, USA.
Program in Biomedical Informatics, Stanford University, Stanford, CA, USA.
Nat Genet. 2021 Nov;53(11):1564-1576. doi: 10.1038/s41588-021-00947-3. Epub 2021 Oct 14.
Transcription factors bind DNA sequence motif vocabularies in cis-regulatory elements (CREs) to modulate chromatin state and gene expression during cell state transitions. A quantitative understanding of how motif lexicons influence dynamic regulatory activity has been elusive due to the combinatorial nature of the cis-regulatory code. To address this, we undertook multiomic data profiling of chromatin and expression dynamics across epidermal differentiation to identify 40,103 dynamic CREs associated with 3,609 dynamically expressed genes, then applied an interpretable deep-learning framework to model the cis-regulatory logic of chromatin accessibility. This analysis framework identified cooperative DNA sequence rules in dynamic CREs regulating synchronous gene modules with diverse roles in skin differentiation. Massively parallel reporter assay analysis validated temporal dynamics and cooperative cis-regulatory logic. Variants linked to human polygenic skin disease were enriched in these time-dependent combinatorial motif rules. This integrative approach shows the combinatorial cis-regulatory lexicon of epidermal differentiation and represents a general framework for deciphering the organizational principles of the cis-regulatory code of dynamic gene regulation.