Charles Perkins Centre and School of Mathematics and Statistics, University of Sydney, Camperdown, NSW 2006, Australia.
Systems Biology Group, Epigenetics & Stem Cell Biology Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, RTP, NC 27709, USA.
Bioinformatics. 2017 Jul 1;33(13):1916-1920. doi: 10.1093/bioinformatics/btx092.
DNA binding proteins such as chromatin remodellers, transcription factors (TFs), histone modifiers and co-factors often bind cooperatively to activate or repress their target genes in a cell type-specific manner. Nonetheless, the precise role of cooperative binding in defining cell-type identity is still largely uncharacterized.
Here, we collected and analyzed 214 public datasets representing chromatin immunoprecipitation followed by sequencing (ChIP-Seq) of 104 DNA binding proteins in embryonic stem cell (ESC) lines. We classified their binding sites into those proximal to gene promoters and those in distal regions, and developed a web resource called Proximal And Distal (PAD) clustering to identify their co-localization at these respective regions. Using this large dataset, we discovered extensive co-localization of BRG1 and CHD7 at distal but not proximal regions. Comparing co-localization sites with sites bound by BRG1 or CHD7 alone showed that co-localization sites are enriched for binding by ESC master TFs and for active chromatin architecture. Most notably, our analysis reveals the co-dependency of BRG1 and CHD7 at distal regions in regulating expression of their common target genes in ESCs. This work sheds light on cooperative binding of DNA binding proteins in regulating gene expression in ESCs, and demonstrates the utility of integrative analysis of a manually curated compendium of genome-wide protein binding profiles in our online resource PAD.
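The proximal/distal split and the co-localization comparison described above can be illustrated with a minimal sketch. This is not PAD's implementation: the function names, the 1 kb promoter window, the use of peak midpoints, and the toy coordinates are all assumptions made for illustration, and the abstract does not state which thresholds or overlap measure PAD actually uses.

```python
from bisect import bisect_left

def nearest_tss_distance(pos, tss_sorted):
    """Distance from pos to the closest TSS in a sorted, non-empty list."""
    i = bisect_left(tss_sorted, pos)
    candidates = []
    if i < len(tss_sorted):
        candidates.append(tss_sorted[i] - pos)
    if i > 0:
        candidates.append(pos - tss_sorted[i - 1])
    return min(candidates)

def classify_peaks(peaks, tss_by_chrom, window=1000):
    """Split (chrom, start, end) peaks into proximal and distal lists.

    A peak is called proximal if its midpoint lies within `window` bp of a
    TSS on the same chromosome. The 1 kb default is a hypothetical choice.
    """
    proximal, distal = [], []
    for chrom, start, end in peaks:
        midpoint = (start + end) // 2
        tss = tss_by_chrom.get(chrom, [])
        if tss and nearest_tss_distance(midpoint, tss) <= window:
            proximal.append((chrom, start, end))
        else:
            distal.append((chrom, start, end))
    return proximal, distal

def overlaps(a, b):
    """True if two half-open intervals on the same chromosome intersect."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def colocalized_fraction(peaks_a, peaks_b):
    """Fraction of peaks in A that overlap at least one peak in B."""
    if not peaks_a:
        return 0.0
    shared = sum(1 for a in peaks_a if any(overlaps(a, b) for b in peaks_b))
    return shared / len(peaks_a)

# Toy, invented coordinates (not real BRG1 or CHD7 peaks).
tss_by_chrom = {"chr1": [1000, 50000]}  # TSS positions, sorted per chromosome
factor_a = [("chr1", 900, 1100), ("chr1", 20000, 20200), ("chr1", 30000, 30200)]
factor_b = [("chr1", 20100, 20300), ("chr1", 30150, 30350), ("chr1", 49000, 49200)]

a_prox, a_dist = classify_peaks(factor_a, tss_by_chrom)
b_prox, b_dist = classify_peaks(factor_b, tss_by_chrom)
print("proximal co-localization:", colocalized_fraction(a_prox, b_prox))
print("distal co-localization:", colocalized_fraction(a_dist, b_dist))
```

The pairwise scan in `colocalized_fraction` is quadratic and is only suitable for small examples; genome-scale peak sets would normally use an interval tree or a sorted sweep instead.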
PAD is freely available at http://pad.victorchang.edu.au/ and its source code is available via an open source GPL 3.0 license at https://github.com/VCCRI/PAD/.
pengyi.yang@sydney.edu.au or j.ho@victorchang.edu.au.
Supplementary data are available at Bioinformatics online.