Fu Audrey Qiuyan, Adryan Boris
Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, UK.
Mol Biosyst. 2009 Dec;5(12):1429-38. doi: 10.1039/B906880e. Epub 2009 Aug 11.
Much of the research using genome-wide chromatin immunoprecipitation (ChIP) and DNA adenine methyltransferase identification (DamID) assays aims to understand the combinatorial nature of transcription factor binding and the chromatin modification code. As these experimental methods become more affordable and widespread, the focus of research is shifting to making sense of the data. Among the many challenges arising in data analysis, we are concerned with identifying biologically meaningful co-occurrences of transcription factor binding or chromatin modifications, using genome-wide profiles generated from ChIP and DamID assays. Co-occurrences are reflected in overlapping and adjacent signals in multiple ChIP or DamID profiles. We review existing quantitative methods to score overlaps and to cluster binding events in ChIP and DamID profiles. For pairwise comparison, existing methods are either based on a single score at the genome level or take a region-specific view of the genome. To draw inference from many profiles simultaneously, methods exist to cluster regions by their regulatory importance or to infer cis-regulatory modules for a particular region. We provide a simple guide to some of the statistical tools used by these methods.
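As a rough illustration of what "a single score at the genome level" for pairwise comparison can look like, the sketch below computes a Jaccard index (base pairs bound in both profiles divided by base pairs bound in either) for two sets of called regions. The Jaccard index is chosen here only as a simple example and is not necessarily one of the methods the review covers; the function names, the half-open interval convention, and the restriction to a single chromosome are all assumptions made for the illustration.

```python
from typing import List, Tuple

# Assumption: each bound region is a half-open interval [start, end)
# on a single chromosome, in base-pair coordinates.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered(intervals: List[Interval]) -> int:
    """Number of base pairs covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


def jaccard(a: List[Interval], b: List[Interval]) -> float:
    """Hypothetical genome-level overlap score: |A and B| / |A or B| in base pairs."""
    union = covered(a + b)
    # Inclusion-exclusion gives the intersection without a separate sweep.
    intersection = covered(a) + covered(b) - union
    return intersection / union if union else 0.0


if __name__ == "__main__":
    profile_a = [(100, 200), (300, 400)]
    profile_b = [(150, 250), (390, 500)]
    print(round(jaccard(profile_a, profile_b), 3))  # 60 / 350, prints 0.171
```

A score like this summarises the whole genome in one number, so it says nothing about where the two profiles agree. That gap is what the region-specific approaches mentioned in the abstract are meant to fill. Judging whether a given value is larger than expected by chance would additionally need a null model, which this sketch does not attempt.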