Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
Department of Biostatistics, Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America.
PLoS Comput Biol. 2018 Aug 24;14(8):e1006372. doi: 10.1371/journal.pcbi.1006372. eCollection 2018 Aug.
Cell-type specific gene expression is regulated by the combinatorial action of transcription factors (TFs). In this study, we predict TF combinations that cooperatively bind in a cell-type specific manner. We first divide DNase hypersensitive sites into cell-type specifically open vs. ubiquitously open sites in 64 cell types to describe possible cell-type specific enhancers. Based on the pattern contrast between these two groups of sequences, we develop "co-occurring TF predictor on Cell-Type specific Enhancers" (coTRaCTE), a novel statistical method to determine regulatory TF co-occurrences. Contrasting the co-binding of TF pairs between cell-type specific and ubiquitously open chromatin guarantees the high cell-type specificity of the predictions. coTRaCTE predicts more than 2000 co-occurring TF pairs in 64 cell types. The large majority (70%) of these TF pairs are highly cell-type specific, and overlaps in TF pair co-occurrence are highly consistent among related cell types. Furthermore, independently validated co-occurring and directly interacting TFs are significantly enriched in our predictions. Focusing on the regulatory network derived from the predicted co-occurring TF pairs in embryonic stem cells (ESCs), we find that it consists of three subnetworks with distinct functions: maintenance of pluripotency, governed by OCT4, SOX2 and NANOG; regulation of early development, governed by KLF4, STAT3, ZIC3 and ZNF148; and general functions, governed by MYC, TCF3 and YY1. In summary, coTRaCTE predicts highly cell-type specific co-occurring TFs, which reveal new insights into transcriptional regulatory mechanisms.
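The contrast idea described above, comparing how often a TF pair co-occurs in cell-type specific sites versus ubiquitously open sites, can be illustrated with a minimal sketch. This is not the coTRaCTE statistic, which the abstract does not specify; it swaps in a plain one-sided Fisher exact test plus a smoothed log odds ratio as a stand-in. The function names, the pseudocount, the TF labels and the toy site lists are all hypothetical and chosen only for illustration.

```python
from math import comb, log2


def cooccurrence_counts(sites, tf_a, tf_b):
    """Return (sites with both TFs, sites without both).

    `sites` is a list of sets; each set holds the TFs predicted to bind one site.
    """
    both = sum(1 for hits in sites if tf_a in hits and tf_b in hits)
    return both, len(sites) - both


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def contrast_pair(specific_sites, ubiquitous_sites, tf_a, tf_b, pseudo=0.5):
    """Contrast co-occurrence of one TF pair between the two site groups.

    Returns (smoothed log2 odds ratio, one-sided p-value). A positive log odds
    ratio means the pair co-occurs more often in the cell-type specific sites.
    The pseudocount is an assumption of this sketch, used to avoid division by zero.
    """
    a, b = cooccurrence_counts(specific_sites, tf_a, tf_b)
    c, d = cooccurrence_counts(ubiquitous_sites, tf_a, tf_b)
    log_odds = log2(((a + pseudo) * (d + pseudo)) / ((b + pseudo) * (c + pseudo)))
    return log_odds, fisher_one_sided(a, b, c, d)


# Toy input (made up): 10 cell-type specific sites and 10 ubiquitous sites.
specific = [{"TF_A", "TF_B"}] * 6 + [{"TF_A"}] * 2 + [set()] * 2
ubiquitous = [{"TF_A", "TF_B"}] * 1 + [{"TF_B"}] * 4 + [{"TF_C"}] * 5

log_odds, p_value = contrast_pair(specific, ubiquitous, "TF_A", "TF_B")
print(f"log2 odds ratio = {log_odds:.2f}, one-sided p = {p_value:.4f}")
```

In a real analysis, every TF pair would be tested in every cell type, so the resulting p-values would need multiple-testing correction before any pair is called cell-type specific.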