Department of Cell and Molecular Biology, Uppsala University, Husargatan 3, Uppsala, Sweden.
BMC Bioinformatics. 2013 Sep 23;14:280. doi: 10.1186/1471-2105-14-280.
Finding peaks in ChIP-seq data is an important step in biological inference. In some cases, such as positioning nucleosomes with specific histone modifications or determining transcription factor binding specificities, the precision of the detected peaks plays a significant role. Several applications for finding peaks (called peak finders) exist, based on different algorithms (e.g. MACS, Erange and HPeak). Benchmark studies have shown that existing peak finders identify different peaks for the same dataset, and it is not known which one is the most accurate. We present the first meta-server, Peak Finder MetaServer (PFMS), which collects results from several peak finders and produces consensus peaks. Our application accepts three standard ChIP-seq data formats: BED, BAM, and SAM.
The sensitivity and specificity of seven widely used peak finders were examined. For the experiments we used three previously studied transcription factor (TF) ChIP-seq datasets and identified three of the selected peak finders that returned results with high specificity and very good sensitivity compared to the remaining four. We also ran PFMS using these three peak finders on the same TF datasets and achieved higher specificity and sensitivity than any of the peak finders achieved individually.
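For reference, sensitivity and specificity can be computed from confusion counts as in the minimal sketch below. The abstract does not say how true and false positives were determined for the peak calls, so the function name and the example counts are purely illustrative assumptions; only the standard definitions are used.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from confusion counts.

    sensitivity = TP / (TP + FN)   fraction of true sites that were called
    specificity = TN / (TN + FP)   fraction of non-sites that were not called
    How a peak is judged "true" (e.g. overlap with a known binding site)
    is an assumption left to the caller; the source does not specify it.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative counts only, not results from the study.
sens, spec = sensitivity_specificity(tp=80, fp=10, tn=90, fn=20)
```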
We show that combining outputs from up to seven peak finders yields better results than individual peak finders. In addition, three of the seven peak finders outperform the remaining four, and running PFMS with these three returns even more accurate results. Another added value of PFMS is a separate report of the peaks returned by each of the included peak finders.
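One plausible way to derive consensus peaks from several peak finders is a simple support threshold: keep every region covered by peaks from at least a given number of distinct tools. The sketch below illustrates that idea only. The abstract does not describe how PFMS actually builds its consensus, so the function `consensus_peaks`, the `min_support` parameter, and the sample coordinates are all hypothetical.

```python
from collections import defaultdict


def consensus_peaks(calls, min_support=2):
    """Return regions covered by peaks from at least `min_support` finders.

    calls: dict mapping a peak-finder label to a list of
           (chrom, start, end) half-open intervals, as in BED.
    This is an assumed voting scheme, not the documented PFMS method.
    """
    events = defaultdict(list)
    for peaks in calls.values():
        # Merge overlapping peaks within one finder so that each finder
        # contributes at most 1 to the coverage depth at any position.
        by_chrom = defaultdict(list)
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))
        for chrom, intervals in by_chrom.items():
            merged = []
            for start, end in sorted(intervals):
                if merged and start <= merged[-1][1]:
                    merged[-1][1] = max(merged[-1][1], end)
                else:
                    merged.append([start, end])
            for start, end in merged:
                events[chrom].append((start, 1))
                events[chrom].append((end, -1))

    result = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # Tuples sort so that ends (-1) precede starts (+1) at the same
        # position, which keeps touching half-open intervals separate.
        for pos, delta in sorted(events[chrom]):
            previous = depth
            depth += delta
            if previous < min_support <= depth:
                region_start = pos
            elif depth < min_support <= previous:
                if pos > region_start:
                    result.append((chrom, region_start, pos))
                region_start = None
    return result


# Hypothetical peak calls from three tools on the same dataset.
calls = {
    "finder_a": [("chr1", 100, 200)],
    "finder_b": [("chr1", 150, 250)],
    "finder_c": [("chr1", 180, 300), ("chr2", 10, 20)],
}
agreed_by_two = consensus_peaks(calls, min_support=2)
agreed_by_all = consensus_peaks(calls, min_support=3)
```

Raising `min_support` trades sensitivity for specificity: fewer regions survive, but each is backed by more tools.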