Benchmarking peak calling methods for CUT&RUN.

Author information

Nooranikhojasteh Amin, Tavallaee Ghazaleh, Orouji Elias

Affiliation

Princess Margaret Cancer Centre, University Health Network (UHN), Toronto, ON M5G 1L7, Canada.

Publication information

Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf375.

Abstract

MOTIVATION

Cleavage Under Targets and Release Using Nuclease (CUT&RUN) has rapidly gained prominence as an effective approach for mapping protein-DNA interactions, especially histone modifications, offering substantial improvements over conventional chromatin immunoprecipitation sequencing (ChIP-seq). However, the effectiveness of this technique is contingent upon accurate peak identification, necessitating the use of optimal peak calling methods tailored to the unique characteristics of CUT&RUN data.

RESULTS

Here, we benchmark four prominent peak calling tools, MACS2, SEACR, GoPeaks, and LanceOtron, evaluating their performance in identifying peaks from CUT&RUN datasets. Our analysis utilizes in-house data of three histone marks (H3K4me3, H3K27ac, and H3K27me3) from mouse brain tissue, as well as samples from the 4D Nucleome database. We systematically assess these tools based on parameters such as the number of peaks called, peak length distribution, signal enrichment, and reproducibility across biological replicates. Our findings reveal substantial variability in peak calling efficacy, with each method demonstrating distinct strengths in sensitivity, precision, and applicability depending on the histone mark in question. These insights provide a comprehensive evaluation that will assist in selecting the most suitable peak caller for high-confidence identification of regions of interest in CUT&RUN experiments, ultimately enhancing the study of chromatin dynamics and transcriptional regulation.
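As an illustration of the kinds of metrics listed above (peak count, peak length distribution, and reproducibility across replicates), the following is a minimal sketch and not the authors' pipeline. It assumes peaks are already loaded as BED-style `(chrom, start, end)` tuples with half-open coordinates; the function names `summarize` and `overlap_fraction` and the toy replicate data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def summarize(peaks):
    """Return peak count and median peak length for a list of (chrom, start, end)."""
    lengths = [end - start for _, start, end in peaks]
    return {
        "n_peaks": len(peaks),
        "median_length": median(lengths) if lengths else 0,
    }


def overlap_fraction(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    A simple stand-in for a replicate-reproducibility measure; quadratic per
    chromosome, which is fine for a small illustration.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks_a:
        # Half-open intervals overlap when each one starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom[chrom]):
            hits += 1
    return hits / len(peaks_a) if peaks_a else 0.0


# Toy peak sets standing in for two biological replicates (made-up coordinates).
rep1 = [("chr1", 100, 500), ("chr1", 1000, 1200), ("chr2", 50, 300)]
rep2 = [("chr1", 450, 700), ("chr2", 400, 600)]

stats_rep1 = summarize(rep1)
frac_rep1_in_rep2 = overlap_fraction(rep1, rep2)
frac_rep2_in_rep1 = overlap_fraction(rep2, rep1)
```

The overlap fraction is asymmetric, so it is worth computing in both directions when comparing replicates; a real analysis would more likely use an interval-tree or a dedicated tool for genome-scale peak sets.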

AVAILABILITY AND IMPLEMENTATION

The CUT&RUN data generated in this study have been deposited in the Gene Expression Omnibus (GEO) under the accession number GSE282809. All the 4D Nucleome datasets can be obtained from the 4D Nucleome Data Portal (https://data.4dnucleome.org/). All scripts used for data processing, figure generation, and analysis are available in the following GitHub repository: https://github.com/OroujiLab/CUTandRun_Peak_Calling/, and have also been archived on Zenodo.
