Yang Chia-Chun, Chen Min-Hsuan, Lin Sheng-Yi, Andrews Erik H, Cheng Chao, Liu Chun-Chi, Chen Jeremy J W
Institute of Molecular Biology, National Chung Hsing University, Taichung, Taiwan.
Institute of Genomics and Bioinformatics, National Chung Hsing University, Taichung, Taiwan.
BMC Genomics. 2017 Jan 10;18(1):61. doi: 10.1186/s12864-016-3450-3.
Transcription factors (TFs) often interact with one another to form TF complexes that bind DNA and regulate gene expression. Many databases have been created to describe known TF complexes identified by either mammalian two-hybrid experiments or data mining. Recently, a wealth of ChIP-seq data on human TFs under different experimental conditions has become available, making it possible to investigate condition-specific (cell type and/or physiologic state) TF complexes and their target genes.
Here, we developed a systematic pipeline to infer Condition-Specific Targets of human TF-TF complexes (called the CST pipeline) by integrating ChIP-seq data and TF motifs. In total, we predicted 2,392 TF complexes and 13,504 high-confidence or 127,994 low-confidence regulatory interactions between TF complexes and their target genes. We validated our predictions by (i) comparing predicted TF complexes to external TF complex databases, (ii) validating selected target genes of TF complexes using ChIP-qPCR and RT-PCR experiments, and (iii) analysing target genes of selected TF complexes using gene ontology enrichment. Finally, we integrated these predictions to construct the CST database.
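To make the idea of condition-specific co-binding concrete, the following is a minimal sketch, not the CST pipeline itself. It assumes that peaks from a single condition are already available as genomic intervals, and it treats two TFs as a candidate pair for a gene when both have a peak overlapping that gene's promoter. The names `Peak`, `overlaps`, `cobound_targets`, and all TF, gene, and coordinate values are hypothetical, and the motif evidence and confidence scoring described in the abstract are deliberately left out.

```python
from itertools import combinations
from typing import NamedTuple


class Peak(NamedTuple):
    """A half-open genomic interval [start, end) on one chromosome."""
    chrom: str
    start: int
    end: int


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def cobound_targets(peaks_by_tf, promoters):
    """Map each TF pair to the genes whose promoter is bound by both TFs.

    peaks_by_tf: dict of TF name -> list of Peak (one condition only).
    promoters:   dict of gene name -> Peak describing the promoter region.
    """
    # Genes bound by each TF individually.
    bound = {
        tf: {
            gene
            for gene, promoter in promoters.items()
            if any(overlaps(peak, promoter) for peak in peaks)
        }
        for tf, peaks in peaks_by_tf.items()
    }
    # Keep only TF pairs that share at least one bound promoter.
    result = {}
    for tf_a, tf_b in combinations(sorted(bound), 2):
        shared = bound[tf_a] & bound[tf_b]
        if shared:
            result[(tf_a, tf_b)] = shared
    return result


# Toy data for one hypothetical condition (all values are made up).
peaks_by_tf = {
    "TF_A": [Peak("chr1", 100, 200), Peak("chr2", 500, 600)],
    "TF_B": [Peak("chr1", 150, 250)],
    "TF_C": [Peak("chr2", 550, 650), Peak("chr1", 900, 950)],
}
promoters = {
    "gene1": Peak("chr1", 120, 220),
    "gene2": Peak("chr2", 520, 620),
}

candidate_pairs = cobound_targets(peaks_by_tf, promoters)
for pair, genes in sorted(candidate_pairs.items()):
    print(pair, sorted(genes))
```

Running the same function separately on the peak sets of each cell type or physiologic state would give one candidate list per condition, which is the sense in which such predictions are condition-specific. A real analysis would add motif support and statistical filtering before calling any pair a complex.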
We established a methodology to construct the CST database, which contributes to the analysis of transcriptional regulation and the identification of novel TF-TF complex formation under a given condition. The database also allows users to visualize condition-specific TF regulatory networks through a user-friendly web interface.