Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore.
Genome Res. 2013 Aug;23(8):1307-18. doi: 10.1101/gr.154922.113. Epub 2013 Apr 3.
The binding of transcription factors (TFs) to their specific motifs in genomic regulatory regions is commonly studied in isolation. However, in order to elucidate the mechanisms of transcriptional regulation, it is essential to determine which TFs bind DNA cooperatively as dimers and to infer the precise nature of these interactions. So far, only a small number of such dimeric complexes are known. Here, we present an algorithm for predicting cell-type-specific TF-TF dimerization on DNA on a large scale, using DNase I hypersensitivity data from 78 human cell lines. We represented the universe of possible TF complexes by their corresponding motif complexes, and analyzed their occurrence at cell-type-specific DNase I hypersensitive sites. Based on ∼1.4 billion tests for motif complex enrichment, we predicted 603 highly significant cell-type-specific TF dimers, the vast majority of which are novel. Our predictions included 76% (19/25) of the known dimeric complexes and showed significant overlap with an experimental database of protein-protein interactions. They were also independently supported by evolutionary conservation, as well as quantitative variation in DNase I digestion patterns. Notably, the known and predicted TF dimers were almost always highly compact and rigidly spaced, suggesting that TFs dimerize in close proximity to their partners, which results in strict constraints on the structure of the DNA-bound complex. Overall, our results indicate that chromatin openness profiles are highly predictive of cell-type-specific TF-TF interactions. Moreover, cooperative TF dimerization seems to be a widespread phenomenon, with multiple TF complexes predicted in most cell types.
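The abstract does not say which statistical test underlies the motif complex enrichment analysis. As a minimal sketch, assuming a one-sided hypergeometric test and a Bonferroni correction over the roughly 1.4 billion tests mentioned above, the significance of one motif complex in one cell type could be assessed as follows. The function names, the 0.05 threshold, and the toy counts are all hypothetical and are not taken from the paper.

```python
from math import comb


def enrichment_p_value(hits_in_cell_type, hits_total, sites_in_cell_type, sites_total):
    """Upper-tail hypergeometric probability P(X >= hits_in_cell_type).

    sites_total        : all DNase I hypersensitive sites considered
    sites_in_cell_type : sites specific to the cell type of interest
    hits_total         : sites (of any kind) containing the motif complex
    hits_in_cell_type  : cell-type-specific sites containing the motif complex
    """
    most_possible = min(hits_total, sites_in_cell_type)
    favourable = sum(
        comb(hits_total, i) * comb(sites_total - hits_total, sites_in_cell_type - i)
        for i in range(hits_in_cell_type, most_possible + 1)
    )
    return favourable / comb(sites_total, sites_in_cell_type)


def passes_bonferroni(p_value, n_tests, alpha=0.05):
    """Bonferroni correction: compare p against alpha divided by the test count.

    alpha=0.05 is an assumed threshold, not one stated in the abstract.
    """
    return p_value < alpha / n_tests


# Toy counts, invented for illustration only.
p = enrichment_p_value(
    hits_in_cell_type=30, hits_total=50, sites_in_cell_type=100, sites_total=1000
)
print(passes_bonferroni(p, n_tests=1_400_000_000))
```

Exact integer arithmetic via `math.comb` keeps the tail probability accurate for small counts; a genome-scale analysis would more likely work in log space or use a statistics library.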