WACS: improving ChIP-seq peak calling by optimally weighting controls.

Author information

School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, K1N6N5, Canada.

Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, K1H8L6, Canada.

Publication information

BMC Bioinformatics. 2021 Feb 15;22(1):69. doi: 10.1186/s12859-020-03927-2.

DOI: 10.1186/s12859-020-03927-2

PMID: 33588754

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7885521/

Abstract

BACKGROUND

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq), introduced more than a decade ago, is widely used by the scientific community to detect protein/DNA binding and histone modifications across the genome. Every experiment is prone to noise and bias, and ChIP-seq experiments are no exception. To alleviate bias, incorporating control datasets into ChIP-seq analysis is an essential step. The controls account for the background signal, while the remainder of the ChIP-seq signal captures true binding or histone modification. However, a recurrent issue is that different ChIP-seq experiments exhibit different types of bias. Depending on which controls are used, different aspects of ChIP-seq bias are better or worse accounted for, and peak calling can produce different results for the same ChIP-seq experiment. Consequently, generating "smart" controls, which model the non-signal effect for a specific ChIP-seq experiment, could enhance contrast and increase the reliability and reproducibility of the results.

RESULTS

We propose a peak calling algorithm, Weighted Analysis of ChIP-seq (WACS), which is an extension of the well-known peak caller MACS2. There are two main steps in WACS: First, weights are estimated for each control using non-negative least squares regression. The goal is to customize controls to model the noise distribution for each ChIP-seq experiment. This is then followed by peak calling. We demonstrate that WACS significantly outperforms MACS2 and AIControl, another recent algorithm for generating smart controls, in the detection of enriched regions along the genome, in terms of motif enrichment and reproducibility analyses.
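The weight-estimation step can be illustrated with a small sketch. It is not the WACS implementation: it assumes that reads have already been counted in fixed genomic bins, it uses a simple projected gradient descent solver in place of a dedicated non-negative least squares routine, and the names `nnls_weights` and `weighted_control` as well as the toy counts are hypothetical.

```python
def nnls_weights(controls, treatment, iterations=5000):
    """Estimate non-negative weights w that minimise
    || sum_j w[j] * controls[j] - treatment ||^2.

    controls  : list of k lists, per-bin read counts of each control (assumed input)
    treatment : list of per-bin read counts of the ChIP-seq experiment
    Solver    : projected gradient descent (illustrative choice only).
    """
    k, n = len(controls), len(treatment)
    # Gram matrix G[i][j] = <control_i, control_j> and vector b[i] = <control_i, treatment>
    gram = [[sum(controls[i][m] * controls[j][m] for m in range(n))
             for j in range(k)] for i in range(k)]
    rhs = [sum(controls[i][m] * treatment[m] for m in range(n)) for i in range(k)]
    # trace(G) bounds the largest eigenvalue of G, so 1/trace(G) is a safe step size
    step = 1.0 / sum(gram[i][i] for i in range(k))
    w = [0.0] * k
    for _ in range(iterations):
        grad = [sum(gram[i][j] * w[j] for j in range(k)) - rhs[i] for i in range(k)]
        # gradient step followed by projection onto the non-negative orthant
        w = [max(0.0, w[i] - step * grad[i]) for i in range(k)]
    return w


def weighted_control(controls, weights):
    """Combine the controls into one background track using the estimated weights."""
    n = len(controls[0])
    return [sum(w * c[m] for w, c in zip(weights, controls)) for m in range(n)]


# Toy data: the treatment is built as 2.0 * control_a + 0.5 * control_b,
# so control_c should receive a weight close to zero.
control_a = [1, 0, 2, 0]
control_b = [0, 1, 0, 2]
control_c = [1, 1, 1, 1]
treatment = [2.0, 0.5, 4.0, 1.0]

weights = nnls_weights([control_a, control_b, control_c], treatment)
background = weighted_control([control_a, control_b, control_c], weights)
```

In this sketch the combined `background` track plays the role of the customized control that would be passed to the peak calling step. Real data would use many more bins, and a library solver such as `scipy.optimize.nnls` would be the more practical choice.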

CONCLUSIONS

This work improves our understanding of ChIP-seq controls and their biases, and shows that WACS results in a better approximation of the noise distribution in controls.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86c/7885521/f3b7ac3ee9e1/12859_2020_3927_Fig1_HTML.jpg
