Department of Pharmacology and Systems Therapeutics, Systems Biology Center New York (SBCNY), Mount Sinai School of Medicine, One Gustave Levy Place, New York, NY 10029, USA.
Bioinformatics. 2010 Oct 1;26(19):2438-44. doi: 10.1093/bioinformatics/btq466. Epub 2010 Aug 13.
Experiments such as ChIP-chip, ChIP-seq, ChIP-PET and DamID (the four methods referred to herein as ChIP-X) are used to profile the binding of transcription factors to DNA at a genome-wide scale. Such experiments provide hundreds to thousands of potential binding sites for a given transcription factor in proximity to gene coding regions.
To integrate data from such studies and use it for further biological discovery, we collected interactions from these experiments to construct a mammalian ChIP-X database. The database contains 189,933 interactions, manually extracted from 87 publications, describing the binding of 92 transcription factors to 31,932 target genes. We used the database to analyze mRNA expression data by performing gene-list enrichment analysis with the ChIP-X database as the prior biological knowledge gene-list library. The system is delivered as a web-based interactive application called ChIP Enrichment Analysis (ChEA). With ChEA, users can input lists of mammalian gene symbols, for which the program computes over-representation of transcription factor targets from the ChIP-X database. The ChEA database allowed us to reconstruct an initial network of transcription factors connected based on shared overlapping targets and binding site proximity. To demonstrate the utility of ChEA, we present three case studies. We show how, by combining the Connectivity Map (CMAP) with ChEA, we can rank pairs of compounds to be used to target specific transcription factor activity in cancer cells.
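The over-representation computation described above can be illustrated with a minimal sketch. The abstract does not name the statistical test, so the hypergeometric upper tail (equivalent to a one-sided Fisher exact test) is an assumption here. The transcription factor names, gene symbols, background size, and the functions `hypergeom_tail` and `enrich` are all hypothetical and are not taken from the ChEA software or database.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which are targets of the transcription factor."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrich(gene_list, library, background_size):
    """Rank each transcription factor in `library` by how strongly its
    target set is over-represented in `gene_list` (smallest p-value first)."""
    genes = {g.upper() for g in gene_list}
    results = []
    for tf, targets in library.items():
        targets = {t.upper() for t in targets}
        overlap = genes & targets
        p = hypergeom_tail(len(overlap), len(targets), len(genes), background_size)
        results.append((tf, len(overlap), p))
    return sorted(results, key=lambda r: r[2])


# Toy library with made-up factors and targets (assumption, not ChIP-X data).
toy_library = {
    "TF_A": ["G1", "G2", "G3", "G4", "G11", "G12", "G13", "G14", "G15", "G16"],
    "TF_B": ["G21", "G22", "G23", "G24", "G25"],
}
input_genes = ["g1", "g2", "g3", "g4", "g99"]

# 20000 is an assumed background size, chosen only for illustration.
ranked = enrich(input_genes, toy_library, background_size=20000)
for tf, n_overlap, p_value in ranked:
    print(tf, n_overlap, p_value)
```

In this toy input, four of the five query genes are targets of the hypothetical `TF_A`, so it ranks first. A real analysis would also correct for testing many transcription factors at once, which this sketch leaves out.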
The ChEA software and ChIP-X database are freely available online at: http://amp.pharm.mssm.edu/lib/chea.jsp.