Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH, USA.
Department of Statistics, Keimyung University, South Korea.
Methods Mol Biol. 2022;2432:167-185. doi: 10.1007/978-1-0716-1994-0_13.
High-throughput assays have been developed to measure DNA methylation, among which bisulfite-based sequencing (BS-seq) and microarray technologies are the most popular for genome-wide profiling. A major goal in DNA methylation analysis is the detection of genomic regions that are differentially methylated between two conditions. Many state-of-the-art methods have been proposed for this purpose in the past few years; however, only a handful of them can analyze both types of data (BS-seq and microarray). In addition, covariates such as sex and age are known to potentially influence DNA methylation, so it is important to adjust for their effects in differential methylation analysis. In this chapter, we describe a Bayesian curve credible bands approach and the accompanying software, BCurve, for detecting differentially methylated regions in data generated from either microarray or BS-seq. What unifies the analysis of these two data types is a model that accounts for correlation between DNA methylation at nearby sites, covariates, and between-sample variability. The BCurve R software package also provides tools for simulating both microarray and BS-seq data, which facilitates comparisons of methods because the "gold standard" is known in simulated data. We provide a detailed description of the main functions in BCurve and demonstrate the utility of the package for both platforms using data simulated with the functions provided in the package. Analyses of two real datasets, one from BS-seq and one from microarray, are also presented to further illustrate the capability of BCurve.
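To make the idea of a curve credible band concrete, the following Python sketch shows one generic way to build a simultaneous credible band from posterior draws of a group-difference curve and to report the regions where the band excludes zero. It is an illustration under stated assumptions, not BCurve's implementation or API: the function names `simultaneous_band` and `flagged_regions`, the CpG positions, the effect size, and the synthetic "posterior draws" (a stand-in for real MCMC output) are all hypothetical, and the band uses the common max-standardized-deviation construction, which may differ from the construction used in the chapter.

```python
import numpy as np


def simultaneous_band(draws, alpha=0.05):
    """Simultaneous (1 - alpha) credible band for a curve.

    draws: array of shape (n_draws, n_sites), posterior samples of the
    difference in methylation between two conditions at each site.
    """
    mean = draws.mean(axis=0)
    sd = draws.std(axis=0, ddof=1)
    # For each draw, the largest standardized deviation from the mean curve.
    max_dev = np.max(np.abs(draws - mean) / sd, axis=1)
    # Critical value chosen so that (1 - alpha) of whole curves lie in the band.
    crit = np.quantile(max_dev, 1 - alpha)
    return mean - crit * sd, mean + crit * sd


def flagged_regions(lower, upper, positions):
    """Merge consecutive sites whose band excludes zero into (start, end) pairs."""
    excludes_zero = (lower > 0) | (upper < 0)
    regions, start = [], None
    for i, flag in enumerate(excludes_zero):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            regions.append((int(positions[start]), int(positions[i - 1])))
            start = None
    if start is not None:
        regions.append((int(positions[start]), int(positions[-1])))
    return regions


# Hypothetical setup: 100 sites, true difference of 0.3 between positions 800 and 1180.
rng = np.random.default_rng(0)
positions = np.arange(0, 2000, 20)
true_diff = np.where((positions >= 800) & (positions < 1200), 0.3, 0.0)

# Stand-in for MCMC output: noise smoothed along the genome so that
# neighbouring sites are correlated, added to the true difference curve.
noise = rng.normal(0.0, 0.05, size=(2000, positions.size))
kernel = np.ones(5) / 5
smooth = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, noise)
draws = true_diff + smooth

lower, upper = simultaneous_band(draws)
regions = flagged_regions(lower, upper, positions)
print(regions)
```

Because the band is simultaneous rather than pointwise, a site is flagged only when the whole-curve uncertainty rules out a zero difference there, which is what allows contiguous flagged sites to be read as a candidate region rather than as a set of separate per-site tests.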