Marand Alexandre P
Department of Genetics, University of Georgia, Athens, USA.
bioRxiv. 2023 Jan 21:2023.01.20.524917. doi: 10.1101/2023.01.20.524917.
The blueprints for development, response to the environment, and cellular function are largely the manifestation of distinct gene expression programs controlled by the spatiotemporal activity of cis-regulatory elements. Although biochemical methods for identifying accessible chromatin, a hallmark of active cis-regulatory elements, have been developed, approaches capable of measuring and quantifying cis-regulatory activity are only beginning to be realized. Massively Parallel Reporter Assays coupled to chromatin accessibility profiling present a high-throughput solution for testing the transcription-activating capacity of millions of putatively regulatory DNA sequences in parallel. However, clear computational pipelines for analyzing these high-throughput sequencing-based reporter assays are lacking. In this protocol, I lay out and rationalize a computational framework for the processing and analysis of Assay for Transposase-Accessible Chromatin profiling followed by Self-Transcribed Active Regulatory Region sequencing (ATAC-STARR-seq) data from a recent study. The approach described herein can be adapted to other sequencing-based reporter assays and, with the appropriate input substitutions, is largely agnostic to the model organism.