Department of Bioinformatics, School of Life Sciences and Technology, Tongji University, Shanghai, China.
Nat Protoc. 2012 Sep;7(9):1728-40. doi: 10.1038/nprot.2012.101. Epub 2012 Aug 30.
Model-based analysis of ChIP-seq (MACS) is a computational algorithm that identifies genome-wide locations of transcription/chromatin factor binding or histone modification from ChIP-seq data. MACS consists of four steps: removing redundant reads, adjusting read position, calculating peak enrichment and estimating the empirical false discovery rate (FDR). In this protocol, we provide a detailed demonstration of how to install MACS and how to use it to analyze three common types of ChIP-seq data sets with different characteristics: the sequence-specific transcription factor FoxA1, the histone modification mark H3K4me3 with sharp enrichment and the H3K36me3 mark with broad enrichment. We also explain how to interpret and visualize the results of MACS analyses. The algorithm requires ∼3 GB of RAM and 1.5 h of computing time to analyze a ChIP-seq data set containing 30 million reads, an estimate that increases with sequence coverage. MACS is open source and is available from http://liulab.dfci.harvard.edu/MACS/.
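The four steps named above can be illustrated with a toy sketch. This is not the MACS implementation: every function name and parameter below is hypothetical, and the sketch makes several simplifying assumptions. Reads are `(position, strand)` tuples on a single chromosome, the fragment size `d` is supplied rather than estimated from the data, enrichment is scored per fixed window with a Poisson tail probability, the control is not scaled to the treatment depth, and the empirical FDR is taken as the ratio of windows called after swapping ChIP and control to windows called in the normal direction.

```python
import math
from collections import Counter


def remove_redundant(reads):
    """Step 1: keep a single read per (position, strand) pair."""
    return sorted(set(reads))


def shift_reads(reads, d):
    """Step 2: move each read d/2 toward its 3' end (d = assumed fragment size)."""
    half = d // 2
    return [pos + half if strand == "+" else pos - half for pos, strand in reads]


def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam); precision is limited for tiny tails."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_windows(signal, background, window, genome_size, p_cutoff):
    """Step 3: return indices of windows where signal is enriched over background."""
    sig = Counter(p // window for p in signal)
    bg = Counter(p // window for p in background)
    # Expected background reads per window if they were spread uniformly.
    lam_genome = len(background) * window / genome_size
    called = []
    for w, count in sig.items():
        # Use the larger of the genome-wide and local background rates.
        lam = max(lam_genome, bg.get(w, 0))
        if poisson_sf(count, lam) < p_cutoff:
            called.append(w)
    return sorted(called)


def empirical_fdr(chip, control, window, genome_size, p_cutoff):
    """Step 4: windows called with samples swapped, divided by windows called normally."""
    real = call_windows(chip, control, window, genome_size, p_cutoff)
    swapped = call_windows(control, chip, window, genome_size, p_cutoff)
    return len(swapped) / len(real) if real else 0.0
```

A typical call sequence under these assumptions would pass each sample through `remove_redundant` and `shift_reads`, then hand the shifted positions to `call_windows` and `empirical_fdr` with the same window size and cutoff.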