Hwang John, Kang Xuedong, Wolf Charlotte, Touma Marlin
Neonatal/Congenital Heart Laboratory, Cardiovascular Research Laboratory, University of California Los Angeles, Los Angeles, CA.
Department of Pediatrics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA.
bioRxiv. 2023 Nov 5:2023.11.04.565657. doi: 10.1101/2023.11.04.565657.
Long non-coding RNA (lncRNA)-mediated transcriptional regulation is increasingly recognized as an important gene regulatory mechanism during development and disease. LncRNAs are emerging as critical regulators of chromatin state, yet the nature and extent of their interactions with chromatin remain to be fully revealed. We have previously identified a specific lncRNA as an essential epigenetic regulator of myogenic differentiation in cardiac and skeletal myocytes in mice and humans. We further demonstrated that its function is mediated by interaction with the chromatin-modifying polycomb repressive complex 2 (PRC2) at the promoters of the myogenic differentiation transcription factors MyoD and Myogenin. Herein, we employed unbiased chromatin isolation by RNA purification (ChIRP) followed by high-throughput sequencing to map the genome-wide repertoire of this lncRNA's chromatin occupancy in a mouse muscle myoblast cell line. We uncovered a total of 99,732 true peaks corresponding to lncRNA-binding sites at high confidence (P-value < 1e-5 and enrichment score ≥ 10). The binding sites averaged 558 bp in length and were distributed widely within the coding and non-coding regions of the genome. Approximately 46% of these true peaks mapped to gene elements, of which 1,180 mapped to experimentally validated promoter sequences. Importantly, the promoter-mapped binding sites were enriched in myogenic transcription factors and heart development, while exhibiting focal interactions with known motifs of proximal promoters and of transcription initiation by RNA Pol II, including the TATA box, transcription initiator, CCAAT-box, and GC-box, supporting a role for the lncRNA in transcription initiation of myogenic regulators. Remarkably, nearly 40% of the binding sites mapped to gene introns, were enriched with the Homeobox family of transcription factors, and exhibited TA-rich motif sequences, suggesting potential motif-specific lncRNA-bound introns. Lastly, more than 136,521 enhancer sequences were detected in the lncRNA-occupancy sites at high confidence. Among these enhancers, 12% exhibited cell type/tissue-specific enrichment in fetal heart and muscles. Together, our findings provide further insights into the genome-wide chromatin interactome of this lncRNA, which may dictate its function in myogenic differentiation and potentially other cellular and biological processes.