Division of Biostatistics, University of California, Berkeley, Berkeley, California.
Department of Statistics, University of California, Berkeley, Berkeley, California.
J Comput Biol. 2020 Apr;27(4):458-471. doi: 10.1089/cmb.2019.0326. Epub 2020 Mar 16.
Whole-genome bisulfite sequencing (WGBS) provides a precise measure of methylation across the genome, yet identifying differentially methylated regions (DMRs) between conditions remains a challenge. Many methods have been developed, but they focus primarily on two-group comparisons. We develop MethCP, a DMR detection method for WGBS data that is applicable to a wide range of experimental designs beyond two-group comparisons, such as time-course data. MethCP identifies DMRs based on change point detection, which naturally segments the genome and provides region-level differential analysis. For simple two-group comparisons, we show that our method outperforms existing methods in accurately detecting the complete DMR on a simulated data set and an Arabidopsis data set. Moreover, we show that MethCP is capable of detecting wide regions with small effect sizes, which can be common in some settings but which existing techniques detect poorly. We also demonstrate the use of MethCP for time-course data on another data set that follows methylation throughout seed germination in Arabidopsis.
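To make the idea of segmenting the genome by change point detection concrete, the following is a minimal sketch and not MethCP's implementation. It assumes one test statistic per cytosine (for example a z-score comparing two groups), ordered by genomic position, and uses plain binary segmentation with a squared-error cost as a stand-in for whatever change point procedure the method actually uses. The function names, the `penalty` and `min_size` parameters, and the `min_abs_mean` cutoff are all hypothetical choices made for illustration.

```python
def best_split(values, start, end, min_size):
    """Find the split of values[start:end] that most reduces squared error.

    Returns (gain, index); index is None when no admissible split helps.
    """
    n = end - start
    total = sum(values[start:end])
    best_gain, best_idx = 0.0, None
    left_sum = 0.0
    for i in range(start + 1, end):
        left_sum += values[i - 1]
        n_left, n_right = i - start, end - i
        if n_left < min_size or n_right < min_size:
            continue
        right_sum = total - left_sum
        # Drop in squared error from fitting two means instead of one.
        diff = left_sum / n_left - right_sum / n_right
        gain = n_left * n_right / n * diff ** 2
        if gain > best_gain:
            best_gain, best_idx = gain, i
    return best_gain, best_idx


def segment(values, penalty=5.0, min_size=3):
    """Binary segmentation: keep splitting while the gain exceeds the penalty.

    Returns half-open (start, end) index pairs covering the whole sequence.
    """
    boundaries = []
    stack = [(0, len(values))]
    while stack:
        start, end = stack.pop()
        gain, idx = best_split(values, start, end, min_size)
        if idx is not None and gain > penalty:
            boundaries.append(idx)
            stack.append((start, idx))
            stack.append((idx, end))
    edges = [0] + sorted(boundaries) + [len(values)]
    return list(zip(edges[:-1], edges[1:]))


def call_dmrs(values, segments, min_abs_mean=1.0):
    """Report segments whose mean statistic is far from zero (hypothetical rule)."""
    dmrs = []
    for start, end in segments:
        mean = sum(values[start:end]) / (end - start)
        if abs(mean) >= min_abs_mean:
            dmrs.append((start, end, mean))
    return dmrs


# Toy input: 10 null sites, 8 shifted sites, 12 null sites.
stats = [0.0] * 10 + [3.0] * 8 + [0.0] * 12
segments = segment(stats)
print(call_dmrs(stats, segments))  # [(10, 18, 3.0)]
```

Because the decision is made per segment rather than per site, a long run of sites with a modest but consistent shift can still accumulate enough gain to be split off, which is the intuition behind detecting wide regions with small effect sizes.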