Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milan, Italy.
Genomics Proteomics Bioinformatics. 2018 Oct;16(5):342-353. doi: 10.1016/j.gpb.2018.05.004. Epub 2018 Dec 19.
Transcriptional regulation is critical to the cellular processes of all organisms. Regulatory mechanisms often involve more than one transcription factor (TF) from different families, which bind together and attach to the DNA as a single complex. However, only a fraction of the regulatory partners of each TF is currently known. In this paper, we present the Transcriptional Interaction and Coregulation Analyzer (TICA), a novel methodology for predicting heterotypic physical interactions of TFs. TICA employs a data-driven approach to infer interactions from chromatin immunoprecipitation and sequencing (ChIP-seq) data. Its prediction rules are based on the distribution of minimal distance couples, that is, pairs of binding sites belonging to different TFs that lie closest to each other in promoter regions. Notably, TICA uses only binding site information from the input ChIP-seq experiments, bypassing the need for motif calling on sequencing data. We present our method and test it on ENCODE ChIP-seq datasets, using three cell lines as reference: HepG2, GM12878, and K562. TICA positive predictions on ENCODE ChIP-seq data are strongly enriched when compared to protein complex (CORUM) and functional interaction (BioGRID) databases. We also compare TICA against motif-based and ChIP-seq-based methods for physical TF-TF interaction prediction, as well as against published literature. Based on our results, TICA offers high specificity (average 0.902) while maintaining a good recall (average 0.284) with respect to CORUM, providing a novel technique for fast analysis of regulatory effects in cell lines. Furthermore, predictions by TICA are complementary to those of other methods for TF-TF interaction prediction (in particular, TACO and CENTDIST). Thus, combined application of these prediction tools results in much improved sensitivity in detecting TF-TF interactions compared to TICA alone (0.526 when combining TICA with TACO and 0.585 when combining it with CENTDIST), with little compromise in specificity (0.760 when combining with TACO and 0.643 with CENTDIST). TICA is publicly available at http://geco.deib.polimi.it/tica/.
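To make the notion of minimal distance couples concrete, the following Python sketch pairs each binding site of one TF with the closest binding site of another TF and summarizes the resulting distances. It is only an illustration under stated assumptions, not TICA's actual implementation: the function name `minimal_distance_couples`, the one-sided nearest-neighbour pairing, the representation of binding sites as integer positions on one chromosome, and the choice of the median as summary statistic are all hypothetical. The positions are assumed to have already been restricted to promoter regions.

```python
from bisect import bisect_left
from statistics import median


def minimal_distance_couples(sites_a, sites_b):
    """Pair each binding site of TF A with the closest binding site of TF B.

    sites_a, sites_b: integer positions on the same chromosome
    (assumed here to be already filtered to promoter regions).
    Returns a list of (pos_a, pos_b, distance) tuples.
    This one-sided pairing is an assumption made for illustration.
    """
    sorted_b = sorted(sites_b)
    if not sorted_b:
        return []
    couples = []
    for a in sites_a:
        # Index of the first B site at or to the right of a.
        i = bisect_left(sorted_b, a)
        # The nearest B site is either the left or the right neighbour.
        candidates = sorted_b[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda b: abs(b - a))
        couples.append((a, nearest, abs(nearest - a)))
    return couples


# Toy positions, invented purely for this example.
tf_a_sites = [100, 500, 900]
tf_b_sites = [120, 450, 2000]

couples = minimal_distance_couples(tf_a_sites, tf_b_sites)
distances = [d for _, _, d in couples]
print(couples)
print("median distance:", median(distances))
```

A decision rule could then compare summary statistics of this distance distribution against those expected for non-interacting TFs; the specific statistics and thresholds TICA uses are not given in the abstract, so none are assumed here.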