Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong.
Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York University, New York, NY, USA.
Bioinformatics. 2017 Dec 1;33(23):3701-3708. doi: 10.1093/bioinformatics/btx467.
DNA methylation is an important epigenetic mechanism in gene regulation, and the detection of differentially methylated regions (DMRs) is of great interest in many disease studies. Existing DMR detection methods can be improved in several respects: (i) methylation statuses of nearby CpG sites are highly correlated, but this correlation has seldom been modelled rigorously because the sites are unevenly spaced; (ii) it is practically important to handle both paired and unpaired samples; and (iii) the ability to detect DMRs from a single pair of samples is needed.
We present DMRMark (DMR detection based on non-homogeneous hidden Markov model), a novel Bayesian framework for detecting DMRs from methylation array data. It combines a constrained Gaussian mixture model, which incorporates biological knowledge, with a non-homogeneous hidden Markov model, which models spatial correlation. Unlike existing methods, our DMR detection requires no predefined boundaries or decision windows. Furthermore, our method can detect DMRs from a single pair of samples and can also incorporate unpaired samples. Both simulation studies and real datasets from The Cancer Genome Atlas showed significant improvement of DMRMark over other methods.
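To illustrate how a non-homogeneous hidden Markov model can account for unevenly spaced CpG sites, the following is a minimal Python sketch. It is not DMRMark's implementation: the exponential decay of the transition matrix with genomic distance, the two-state setup, and all names and parameter values (`transition_matrix`, `posterior_states`, `length_scale_bp`, the means and standard deviations) are assumptions made for illustration only.

```python
import numpy as np


def transition_matrix(distance_bp, stationary, length_scale_bp=500.0):
    """Distance-dependent transition matrix (illustrative assumption).

    Nearby sites tend to keep the same state; distant sites fall back to
    the stationary distribution. Each row sums to 1 for any distance.
    """
    stationary = np.asarray(stationary, dtype=float)
    rho = np.exp(-distance_bp / length_scale_bp)  # correlation decays with distance
    k = stationary.size
    return rho * np.eye(k) + (1.0 - rho) * np.tile(stationary, (k, 1))


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def posterior_states(positions, values, means, sds, stationary,
                     length_scale_bp=500.0):
    """Posterior state probabilities per site via scaled forward-backward.

    positions: genomic coordinates of the CpG sites (sorted, in bp)
    values:    one observation per site, e.g. a paired methylation difference
    means/sds: Gaussian emission parameters, one entry per hidden state
    """
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    stationary = np.asarray(stationary, dtype=float)
    n, k = values.size, means.size

    # Emission likelihoods, shape (n_sites, n_states).
    emis = gaussian_pdf(values[:, None], means[None, :], sds[None, :])
    # One transition matrix per gap between adjacent sites.
    trans = [transition_matrix(d, stationary, length_scale_bp)
             for d in np.diff(positions)]

    # Forward pass, normalised at every site to avoid underflow.
    fwd = np.zeros((n, k))
    fwd[0] = stationary * emis[0]
    fwd[0] /= fwd[0].sum()
    for t in range(1, n):
        fwd[t] = (fwd[t - 1] @ trans[t - 1]) * emis[t]
        fwd[t] /= fwd[t].sum()

    # Backward pass with the same normalisation.
    bwd = np.ones((n, k))
    for t in range(n - 2, -1, -1):
        bwd[t] = trans[t] @ (emis[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()

    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)


# Toy example: state 0 = not differentially methylated, state 1 = differentially
# methylated. The first three sites are close together and show a clear shift.
positions = [100, 150, 200, 5000, 5050]
values = [2.1, 1.9, 2.0, 0.0, 0.1]
post = posterior_states(positions, values,
                        means=[0.0, 2.0], sds=[0.5, 0.5],
                        stationary=[0.9, 0.1])
print(np.round(post, 3))
```

In this sketch, a region would be called by joining consecutive sites whose posterior probability for the differentially methylated state exceeds a chosen cutoff, so no window size or region boundary has to be fixed in advance.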
DMRMark is freely available as an R package at the CRAN R package repository.
Supplementary data are available at Bioinformatics online.