Dror Iris, Golan Tamar, Levy Carmit, Rohs Remo, Mandel-Gutfreund Yael
Faculty of Biology, Technion-Israel Institute of Technology, Technion City, Haifa 32000, Israel; Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, California 90089, USA;
Department of Human Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
Genome Res. 2015 Sep;25(9):1268-80. doi: 10.1101/gr.184671.114. Epub 2015 Jul 9.
Transcriptional regulation requires the binding of transcription factors (TFs) to short sequence-specific DNA motifs, usually located in gene regulatory regions. Extensive data from genomic assays show that only a small fraction of all potential binding sites containing the consensus motif of a given TF actually bind the protein. Recent in vitro binding assays, which exclude the effects of the cellular environment, also demonstrate selective TF binding. An intriguing conjecture is that the surroundings of cognate binding sites have unique characteristics that distinguish them from other sequences that contain a similar motif but are not bound by the TF. To test this hypothesis, we conducted a comprehensive analysis of the sequence and DNA shape features surrounding the core-binding sites of 239 and 56 TFs extracted from in vitro HT-SELEX binding assays and in vivo ChIP-seq data, respectively. Comparing the nucleotide content of the regions around the TF-bound sites with that of the corresponding unbound regions containing the same consensus motifs revealed significant differences that extend far beyond the core-binding site. Specifically, the environment of the bound motifs showed unique sequence compositions, distinct DNA shape features, and overall high similarity to the core-binding motif. Notably, the regions around the binding sites of TFs that belong to the same TF family exhibited similar features, with high agreement between the in vitro and in vivo data sets. We propose that these unique features assist in guiding TFs to their cognate binding sites.
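The comparison of nucleotide content around bound and unbound motif occurrences can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `flank_gc` and `mean_flank_gc`, the flank width of 5 bases, the choice of GC fraction as the composition measure, and the toy sequences are all assumptions made for illustration only.

```python
import re

def flank_gc(seq, motif, flank=5):
    """GC fraction of the bases flanking the first motif occurrence.

    Returns None if the motif is absent or there are no flanking bases.
    Hypothetical helper; GC fraction is one simple composition measure.
    """
    match = re.search(motif, seq)
    if match is None:
        return None
    left = seq[max(0, match.start() - flank):match.start()]
    right = seq[match.end():match.end() + flank]
    flanks = left + right
    if not flanks:
        return None
    return sum(base in "GC" for base in flanks) / len(flanks)

def mean_flank_gc(seqs, motif, flank=5):
    """Average flank GC fraction over the sequences that contain the motif."""
    values = [v for v in (flank_gc(s, motif, flank) for s in seqs) if v is not None]
    return sum(values) / len(values) if values else float("nan")

# Toy sequences (invented, not data from the study), all sharing the same core motif.
core = "CACGTG"
bound = ["GGCGCCACGTGGCGCC", "CCGGCCACGTGCGGCG"]
unbound = ["ATATTCACGTGATATA", "TTAAACACGTGTTTAA"]

print("bound flank GC:  ", mean_flank_gc(bound, core))
print("unbound flank GC:", mean_flank_gc(unbound, core))
```

In a real analysis the two sets would come from assay data, and a statistical test across many sites, rather than a comparison of two means, would be needed to call a difference significant.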