Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.
Philos Trans R Soc Lond B Biol Sci. 2013 Nov 11;368(1632):20130029. doi: 10.1098/rstb.2013.0029. Print 2013 Dec 19.
Mapping the DNA-binding preferences of transcription factor (TF) complexes is critical for deciphering the functions of cis-regulatory elements. Here, we developed a computational method that compares co-occurring motif spacings in conserved versus unconserved regions of the human genome to detect evolutionarily constrained binding sites of rigid TF complexes. Structural data were used to estimate TF complex physical plausibility, explore overlapping motif arrangements seldom tackled by non-structure-aware methods, and generate and analyse three-dimensional models of the predicted complexes bound to DNA. Using this approach, we predicted 422 physically realistic TF complex motifs at 18% false discovery rate, the majority of which (326, 77%) contain some sequence overlap between binding sites. The set of mostly novel complexes is enriched in known composite motifs, predictive of binding site configurations in TF-TF-DNA crystal structures, and supported by ChIP-seq datasets. Structural modelling revealed three cooperativity mechanisms: direct protein-protein interactions, potentially indirect interactions and 'through-DNA' interactions. Indeed, 38% of the predicted complexes were found to contain four or more bases in which TF pairs appear to synergize through overlapping binding to the same DNA base pairs in opposite grooves or strands. Our TF complex and associated binding site predictions are available as a web resource at http://bejerano.stanford.edu/complex.
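The core idea of comparing motif-pair spacings in conserved versus unconserved sequence can be illustrated with a small sketch. This is not the authors' statistic: it assumes a one-sided hypergeometric test for over-representation of each spacing among conserved occurrences, followed by Benjamini-Hochberg adjustment to control the false discovery rate. The function names (`hypergeom_sf`, `spacing_enrichment`, `benjamini_hochberg`) and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable X (exact, integer arithmetic)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def spacing_enrichment(conserved, unconserved):
    """Test each spacing for over-representation in conserved regions.

    conserved / unconserved: dict mapping spacing (bp offset between two
    motifs) to the number of co-occurrences observed in that class.
    Returns a dict mapping spacing to a one-sided p-value.
    """
    n_cons = sum(conserved.values())
    n_all = n_cons + sum(unconserved.values())
    pvals = {}
    for s in sorted(set(conserved) | set(unconserved)):
        k = conserved.get(s, 0)
        n_s = k + unconserved.get(s, 0)  # all occurrences at this spacing
        pvals[s] = hypergeom_sf(k, n_all, n_s, n_cons)
    return pvals


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted q-values keyed like the input."""
    ranked = sorted(pvals.items(), key=lambda kv: kv[1])
    m = len(ranked)
    qvals = {}
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        key, p = ranked[rank - 1]
        running_min = min(running_min, p * m / rank)
        qvals[key] = running_min
    return qvals


# Toy counts (invented for illustration): spacing 3 is enriched when conserved.
toy_conserved = {0: 5, 3: 40, 7: 5}
toy_unconserved = {0: 50, 3: 50, 7: 50}
toy_pvals = spacing_enrichment(toy_conserved, toy_unconserved)
toy_qvals = benjamini_hochberg(toy_pvals)
```

In this toy setting, a spacing whose q-value falls below a chosen threshold would be a candidate constrained arrangement. The published method additionally filters candidates by structural plausibility, which this sketch does not attempt.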