School of Biological Sciences, Georgia Institute of Technology.
Department of Statistics, Seoul National University.
Brief Bioinform. 2019 Jan 18;20(1):33-46. doi: 10.1093/bib/bbx077.
DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely used method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel with the growing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less attention has been paid to the analysis of invertebrate or plant DNA methylation data. Using honey bees and humans as examples, we demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation.