Department of Computer Science, Princeton University, Princeton, NJ 08540, USA and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA.
Nucleic Acids Res. 2014 Mar;42(5):2833-47. doi: 10.1093/nar/gkt1302. Epub 2013 Dec 23.
Combinatorial interplay among transcription factors (TFs) is an important mechanism by which transcriptional regulatory specificity is achieved. However, despite the increasing number of TFs for which either binding specificities or genome-wide occupancy data are known, knowledge about cooperativity between TFs remains limited. To address this, we developed a computational framework for predicting genome-wide co-binding between TFs (CCAT, Combinatorial Code Analysis Tool), and applied it to Drosophila melanogaster to uncover cooperativity among TFs during embryo development. Using publicly available TF binding specificity data and DNaseI chromatin accessibility data, we first predicted genome-wide binding sites for 324 TFs across five stages of D. melanogaster embryo development. We then applied CCAT in each of these developmental stages, and identified between 19 and 58 pairs of TFs in each stage whose predicted binding sites are significantly co-localized. We found that nearby binding sites for pairs of TFs predicted to cooperate were enriched in regions bound in relevant ChIP experiments, and were more evolutionarily conserved than those of other pairs. Further, we found that TFs tend to be co-localized with other TFs in a dynamic manner across developmental stages. All generated data as well as source code for our end-to-end pipeline are available at http://cat.princeton.edu.
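The abstract does not state which statistic CCAT uses to call a TF pair "significantly co-localized", so the following is only a minimal sketch of one common way such a test could be set up, not the authors' method. It assumes predicted sites are counted per accessible region and scores the overlap of two TFs with a one-sided hypergeometric test. All function names, the region and site tuple layouts, and the choice of test are hypothetical.

```python
from math import comb

# Hypothetical layouts (not from the paper):
#   region = (chrom, start, end), half-open interval [start, end)
#   site   = (chrom, position)

def regions_hit(regions, sites):
    """Return indices of regions that contain at least one predicted site."""
    hit = set()
    for i, (chrom, start, end) in enumerate(regions):
        for site_chrom, pos in sites:
            if site_chrom == chrom and start <= pos < end:
                hit.add(i)
                break
    return hit

def hypergeom_tail(n_total, n_a, n_b, n_both):
    """P(X >= n_both) when n_b regions are drawn without replacement
    from n_total regions, n_a of which contain a site for TF A."""
    upper = min(n_a, n_b)
    numer = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return numer / comb(n_total, n_b)

def colocalization_pvalue(regions, sites_a, sites_b):
    """Count regions shared by both TFs and score the overlap.
    Assumption: a one-sided hypergeometric test stands in for
    whatever statistic CCAT actually applies."""
    hit_a = regions_hit(regions, sites_a)
    hit_b = regions_hit(regions, sites_b)
    n_both = len(hit_a & hit_b)
    p = hypergeom_tail(len(regions), len(hit_a), len(hit_b), n_both)
    return n_both, p
```

In a sketch like this, the test would be run separately for each developmental stage, using that stage's accessible regions, and the resulting p-values would need multiple-testing correction across all TF pairs.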