Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Bioinformatics. 2023 Dec 1;39(12). doi: 10.1093/bioinformatics/btad725.
Analysis of open chromatin regions across multiple samples from two or more distinct conditions can reveal altered gene regulatory patterns associated with biological phenotypes and complex traits. The ATAC-seq assay allows for tractable genome-wide open chromatin profiling of large numbers of samples. Because stable, broadly applicable genomic annotations of open chromatin regions are not available, most studies first identify open regions by applying peak calling methods to each sample independently and then heuristically combine the results into a consensus peak set. Reconciling sample-specific peak results post hoc is particularly challenging for larger cohorts, and informative spatial features specific to open chromatin signals are not leveraged effectively.
We propose a novel method, ROCCO, that determines consensus open chromatin regions across multiple samples simultaneously. ROCCO employs robust summary statistics and solves a constrained optimization problem formulated to account for both enrichment and spatial dependence of open chromatin signal data. We show this formulation admits attractive theoretical and conceptual properties as well as superior empirical performance compared to current methodology.
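To make the idea of a constrained optimization over enrichment and spatial dependence concrete, the following is a minimal illustrative sketch and not ROCCO's actual formulation or code, which the abstract does not specify. It assumes the genome has been split into consecutive bins, each with a hypothetical per-bin enrichment score already summarized across samples. It then selects a binary labeling that maximizes total score, minus a penalty `gamma` for every switch between adjacent bins, subject to a cap `budget` on the number of selected bins. The function name `select_bins`, the parameters `budget` and `gamma`, and the dynamic-programming solver are all assumptions made for illustration.

```python
from typing import List, Tuple


def select_bins(scores: List[float], budget: int, gamma: float) -> Tuple[float, List[int]]:
    """Toy consensus-region selection (illustrative only, not ROCCO itself).

    Chooses x_i in {0, 1} to maximize
        sum(scores[i] * x_i) - gamma * sum(|x_i - x_{i+1}|)
    subject to sum(x_i) <= budget, using exact dynamic programming.
    """
    n = len(scores)
    if n == 0:
        return 0.0, []
    neg_inf = float("-inf")

    # best[k][s]: best objective over bins 0..i with k bins selected and x_i == s
    best = [[neg_inf, neg_inf] for _ in range(budget + 1)]
    best[0][0] = 0.0
    if budget >= 1:
        best[1][1] = scores[0]

    back = []  # back[i - 1][k][s]: value of x_{i-1} on the best path into (i, k, s)
    for i in range(1, n):
        new = [[neg_inf, neg_inf] for _ in range(budget + 1)]
        choice = [[0, 0] for _ in range(budget + 1)]
        for k in range(budget + 1):
            for state in (0, 1):
                prev_k = k - state  # bins selected before bin i
                if prev_k < 0:
                    continue
                for prev in (0, 1):
                    if best[prev_k][prev] == neg_inf:
                        continue
                    # enrichment reward minus spatial (switching) penalty
                    val = best[prev_k][prev] + scores[i] * state - gamma * abs(state - prev)
                    if val > new[k][state]:
                        new[k][state] = val
                        choice[k][state] = prev
        back.append(choice)
        best = new

    # Best final (count, state) pair; the all-zero labeling guarantees a finite value.
    value, k, state = max(
        (best[kk][s], kk, s) for kk in range(budget + 1) for s in (0, 1)
    )

    # Trace the optimal labeling backwards.
    x = [0] * n
    for i in range(n - 1, -1, -1):
        x[i] = state
        if i > 0:
            prev = back[i - 1][k][state]
            k -= state
            state = prev
    return value, x


if __name__ == "__main__":
    # Hypothetical per-bin scores, invented for illustration.
    toy_scores = [0.1, 0.9, 0.2, 0.8, 0.1]
    print(select_bins(toy_scores, budget=3, gamma=0.3))
```

With `gamma` set to zero the selection reduces to picking the highest-scoring bins under the budget. A positive `gamma` favors contiguous runs, which is the sense in which spatial dependence enters this toy objective.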
Source code, documentation, and usage demos for ROCCO are available on GitHub at: https://github.com/nolan-h-hamilton/ROCCO. ROCCO can also be installed as a stand-alone binary utility using pip/PyPI.