Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America.
PLoS Comput Biol. 2023 Oct 20;19(10):e1011568. doi: 10.1371/journal.pcbi.1011568. eCollection 2023 Oct.
Histone ChIP-seq is one of the primary methods for charting the cellular epigenomic landscape, the components of which play a critical regulatory role in gene expression. Analyzing the activity of regulatory elements across datasets and cell types can be challenging because peak positions shift and because normalization artifacts arise from, for example, differing read depths, ChIP efficiencies, and target sizes. Moreover, the broad regions of enrichment seen in repressive histone marks often evade detection by commonly used peak callers. Here, we present a simple and versatile method for identifying enriched regions in ChIP-seq data. The method estimates a gamma distribution fit to non-overlapping 5 kb genomic bins to establish a global background, and uses this distribution to assign each 5 kb bin a probability of being signal (PBS) between zero and one. Although lower in resolution than typical peak-calling methods, this approach provides a straightforward way to identify enriched regions and to compare enrichment among multiple datasets, because it transforms the data into universally normalized values that can be readily visualized and integrated with downstream analysis methods. We demonstrate applications of PBS to both broad and narrow histone marks, and provide several illustrations of the biological insights that can be gleaned by integrating PBS scores with downstream data types.
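The idea of scoring bins against a gamma-distributed background can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the background fit is restricted or how the probability is derived from it. The sketch assumes that bins at or below a chosen quantile are mostly background, fits a gamma distribution to those bins with SciPy, and uses the fitted background CDF as a stand-in score between zero and one. The function names, the `trim_quantile` parameter, and the simulated data are all hypothetical.

```python
import numpy as np
from scipy import stats


def fit_background_gamma(bin_signal, trim_quantile=0.95):
    """Fit a gamma distribution to the bins assumed to be background.

    Assumption (not from the source): bins at or below `trim_quantile`
    are treated as background. Trimming the upper tail slightly biases
    the fit, which is acceptable for an illustration.
    """
    x = np.asarray(bin_signal, dtype=float)
    x = x[x > 0]  # the gamma distribution is defined on positive values
    cutoff = np.quantile(x, trim_quantile)
    background = x[x <= cutoff]
    # floc=0 fixes the location parameter so only shape and scale are fitted.
    shape, _loc, scale = stats.gamma.fit(background, floc=0)
    return shape, scale


def background_cdf_score(bin_signal, shape, scale):
    """Score each bin by the fitted background CDF (a value in [0, 1]).

    This is a simplified stand-in for PBS: it is the probability that a
    background bin would show a lower value, not a calibrated posterior.
    """
    x = np.asarray(bin_signal, dtype=float)
    return stats.gamma.cdf(x, a=shape, scale=scale)


# Simulated per-bin signal (hypothetical data): 9,500 background bins
# plus 500 enriched bins shifted well above the background.
rng = np.random.default_rng(0)
n_background, n_enriched = 9500, 500
background_bins = rng.gamma(shape=2.0, scale=1.5, size=n_background)
enriched_bins = rng.gamma(shape=2.0, scale=1.5, size=n_enriched) + 20.0
signal = np.concatenate([background_bins, enriched_bins])

shape, scale = fit_background_gamma(signal)
pbs = background_cdf_score(signal, shape, scale)

print(f"fitted shape={shape:.2f}, scale={scale:.2f}")
print(f"mean score, background bins: {pbs[:n_background].mean():.2f}")
print(f"mean score, enriched bins:   {pbs[n_background:].mean():.4f}")
```

Because every dataset is mapped onto the same zero-to-one scale relative to its own background, scores from samples with different read depths can be compared directly, which is the property the abstract emphasizes. Under this CDF-based stand-in, background bins spread across the whole interval, so a threshold close to one would be needed to call a bin enriched.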