Jeon Hyeongrin, Lee Hyunji, Kang Byunghee, Jang Insoon, Roh Tae-Young
Department of Life Sciences, Pohang University of Science and Technology (POSTECH), Pohang 37673, Korea.
Division of Integrative Biosciences and Biotechnology, Pohang University of Science and Technology (POSTECH), Pohang 37673, Korea.
Genomics Inform. 2020 Dec;18(4):e42. doi: 10.5808/GI.2020.18.4.e42. Epub 2020 Dec 14.
Chromatin immunoprecipitation coupled with high-throughput DNA sequencing (ChIP-Seq) is a powerful technology for profiling the locations of proteins of interest on a whole-genome scale. Many programs and algorithms have been proposed to identify the regions where these proteins are enriched. However, none of the commonly used peak calling programs could accurately explain the binding features of target proteins detected by ChIP-Seq. Here, publicly available data on 12 histone modifications, including H3K4ac/me1/me2/me3, H3K9ac/me3, H3K27ac/me3, H3K36me3, H3K56ac, and H3K79me1/me2, generated from a human embryonic stem cell line (H1), were profiled with five peak callers (CisGenome, MACS1, MACS2, PeakSeq, and SISSRs). The performance of the peak calling programs was compared in terms of reproducibility between replicates, examination of enriched regions at variable sequencing depths, the specificity-to-noise signal, and the sensitivity of peak prediction. There were no major differences among the peak callers when analyzing point source histone modifications. The peak calling results for histone modifications with low fidelity, such as H3K4ac, H3K56ac, and H3K79me1/me2, showed low performance in all parameters, which indicates that their peak positions might not be located accurately. Our comparative results could serve as a helpful guide for choosing a suitable peak calling program for specific histone modifications.
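As an illustration of the first comparison criterion, reproducibility between replicates, the following minimal Python sketch computes the fraction of peaks in one replicate that overlap at least one peak in the other. The abstract does not state how reproducibility was measured, so the overlap-fraction metric, the `Peak` tuple layout, the function names, and the toy coordinates are all assumptions made for illustration only, not the authors' method.

```python
from typing import List, Tuple

# Hypothetical peak representation: (chromosome, start, end), half-open interval.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_reproduced(rep1: List[Peak], rep2: List[Peak]) -> float:
    """Fraction of peaks in rep1 that overlap any peak in rep2.

    Assumed metric for illustration; quadratic in the number of peaks,
    which is acceptable for a small example but not for genome-scale data.
    """
    if not rep1:
        return 0.0
    hits = sum(1 for p in rep1 if any(overlaps(p, q) for q in rep2))
    return hits / len(rep1)


# Toy peak sets (invented coordinates, not taken from the study).
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
rep2 = [("chr1", 150, 250), ("chr2", 300, 400)]

if __name__ == "__main__":
    print(fraction_reproduced(rep1, rep2))
    print(fraction_reproduced(rep2, rep1))
```

The measure is asymmetric, so in practice it might be reported in both directions or replaced by a symmetric statistic. For real peak files, an interval-tree or sorted-sweep approach would be preferable to the nested loop used here.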