Ma Wenxiu, Wong Wing Hung
Department of Computer Science, Stanford University, Stanford, California, USA.
Methods Enzymol. 2011;497:51-73. doi: 10.1016/B978-0-12-385075-1.00003-2.
Chromatin immunoprecipitation coupled with ultra-high-throughput parallel DNA sequencing (ChIP-seq) is an effective technology for the investigation of genome-wide protein-DNA interactions. Examples of applications include studies of RNA polymerase transcription, transcriptional regulation, and histone modifications. The technology provides accurate, high-resolution mapping of protein-DNA binding loci that are important for understanding many processes in development and disease. Since the introduction of ChIP-seq experiments in 2007, many statistical and computational methods have been developed to support the analysis of the massive datasets from these experiments. However, because the analysis workflow is complex and multistaged, it is still difficult for experimental investigators to analyze their own ChIP-seq data. In this chapter, we review the basic design of ChIP-seq experiments and provide an in-depth tutorial on how to prepare, preprocess, and analyze ChIP-seq datasets. The tutorial is based on a revised version of our software package CisGenome, which was designed to encompass most standard tasks in ChIP-seq data analysis. Relevant statistical and computational issues are highlighted, discussed, and illustrated with real data examples.