Department of Biological Sciences, Columbia University, New York, NY, USA.
Department of Systems Biology, Columbia University Medical Center, New York, NY, USA.
Nucleic Acids Res. 2023 Jun 23;51(11):5499-5511. doi: 10.1093/nar/gkad232.
Classic promoter mutagenesis strategies can be used to study how proximal promoter regions regulate the expression of particular genes of interest. This is a laborious process: the smallest sub-region of the promoter still capable of recapitulating expression in an ectopic setting is first identified, and putative transcription factor binding sites within it are then mutated in a targeted fashion. Massively parallel reporter assays such as survey of regulatory elements (SuRE) provide an alternative way to study millions of promoter fragments in parallel. Here we show how a generalized linear model (GLM) can be used to transform genome-scale SuRE data into a high-resolution genomic track that quantifies the contribution of local sequence to promoter activity. This coefficient track helps identify regulatory elements and can be used to predict the promoter activity of any sub-region in the genome. It thus enables in silico dissection of any promoter in the human genome. We developed a web application, available at cissector.nki.nl, that lets researchers easily perform this analysis as a starting point for their research into any promoter of interest.
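The idea of turning fragment-level reporter counts into a per-position coefficient track can be illustrated with a toy example. The sketch below is not the authors' model: it assumes, purely for illustration, that each fragment's expected count follows a Poisson distribution whose log-mean is an intercept plus the sum of per-bin coefficients over the bins the fragment covers. The function names (`coverage_matrix`, `fit_poisson_glm`, `predict_activity`), the binning, the ridge penalty and the simulated data are all hypothetical.

```python
import numpy as np

def coverage_matrix(starts, ends, n_bins):
    """Binary design matrix: X[i, j] = 1 if fragment i covers bin j (end exclusive)."""
    bins = np.arange(n_bins)
    return ((bins >= starts[:, None]) & (bins < ends[:, None])).astype(float)

def fit_poisson_glm(X, y, ridge=1e-3, n_iter=50, tol=1e-8):
    """Fit a log-link Poisson GLM by iteratively reweighted least squares.

    A small ridge penalty (an assumption of this sketch) is applied to the
    per-bin coefficients, not the intercept, to keep the solve well conditioned.
    Returns (intercept, per-bin coefficient track).
    """
    n, p = X.shape
    Xd = np.hstack([np.ones((n, 1)), X])      # prepend intercept column
    beta = np.zeros(p + 1)
    beta[0] = np.log(y.mean() + 1e-8)         # start from the mean count
    penalty = ridge * np.eye(p + 1)
    penalty[0, 0] = 0.0
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -20, 20)     # linear predictor, clipped for safety
        mu = np.exp(eta)                      # expected counts (also the IRLS weights)
        z = eta + (y - mu) / mu               # working response
        lhs = Xd.T @ (Xd * mu[:, None]) + penalty
        beta_new = np.linalg.solve(lhs, Xd.T @ (mu * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta[0], beta[1:]

def predict_activity(intercept, coef, start, end):
    """Predicted expected count for an arbitrary sub-region [start, end)."""
    return float(np.exp(intercept + coef[start:end].sum()))

# Simulated toy data: 20 bins, one activating bin (10) and one repressive bin (5).
rng = np.random.default_rng(0)
n_bins, n_frag = 20, 2000
true_coef = np.zeros(n_bins)
true_coef[10], true_coef[5] = 2.0, -1.0
starts = rng.integers(0, n_bins, size=n_frag)
ends = np.minimum(starts + rng.integers(1, 9, size=n_frag), n_bins)
X = coverage_matrix(starts, ends, n_bins)
counts = rng.poisson(np.exp(np.log(2.0) + X @ true_coef))

intercept_hat, coef_hat = fit_poisson_glm(X, counts)
print("bin with largest coefficient:", int(np.argmax(coef_hat)))
print("predicted activity of bins 8-11:", predict_activity(intercept_hat, coef_hat, 8, 12))
```

In this toy setting the fitted `coef_hat` plays the role of the coefficient track: peaks suggest activating sequence, troughs suggest repressive sequence, and summing the track over any interval gives a predicted activity for that sub-region, which mirrors the in silico dissection described in the abstract.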