EMBL Australia Partner Laboratory Network, Australian National University, Canberra, ACT, Australia.
The John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia.
Nat Commun. 2021 Jun 8;12(1):3438. doi: 10.1038/s41467-021-23778-6.
DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although multiple tools can detect it from Nanopore sequencing, their accuracy remains largely unknown. Here, we present a systematic benchmarking of tools for the detection of CpG methylation from Nanopore sequencing using individual reads, control mixtures of methylated and unmethylated reads, and bisulfite sequencing. We find that the tools trade off false positives against false negatives and show high dispersion around the expected methylation frequency values. We describe various strategies to improve the accuracy of these tools, including a consensus approach, METEORE (https://github.com/comprna/METEORE), which combines the predictions from two or more tools and shows improved accuracy over individual tools. Snakemake pipelines are also provided for reproducibility and to enable the systematic application of our analyses to other datasets.
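To illustrate the general idea of a consensus over per-read predictions, here is a minimal Python sketch. It is not METEORE's actual method: the abstract states only that predictions from two or more tools are combined, so the averaging rule, the 0.5 threshold, the input layout (a dictionary keyed by read ID, chromosome, and position), and the function names `consensus_calls` and `site_frequencies` are all assumptions made for illustration.

```python
from collections import defaultdict


def consensus_calls(tool_a, tool_b, threshold=0.5):
    """Combine per-read methylation probabilities from two tools.

    ASSUMPTION: each input maps (read_id, chrom, pos) -> probability in [0, 1],
    and the consensus is a plain average. This is an illustrative rule only.
    Keys reported by just one tool are dropped.
    """
    shared = tool_a.keys() & tool_b.keys()
    combined = {}
    for key in shared:
        mean_p = (tool_a[key] + tool_b[key]) / 2.0
        # Call the CpG methylated on this read if the mean reaches the threshold.
        combined[key] = (mean_p, mean_p >= threshold)
    return combined


def site_frequencies(calls):
    """Per-site methylation frequency: methylated reads / covering reads."""
    methylated = defaultdict(int)
    total = defaultdict(int)
    for (_read_id, chrom, pos), (_mean_p, is_meth) in calls.items():
        total[(chrom, pos)] += 1
        methylated[(chrom, pos)] += int(is_meth)
    return {site: methylated[site] / total[site] for site in total}


# Hypothetical example data: three reads at chr1:100 are seen by both tools.
example_a = {
    ("r1", "chr1", 100): 0.9,
    ("r2", "chr1", 100): 0.2,
    ("r3", "chr1", 100): 0.7,
    ("r1", "chr1", 200): 0.1,  # only tool A reports this one
}
example_b = {
    ("r1", "chr1", 100): 0.8,
    ("r2", "chr1", 100): 0.4,
    ("r3", "chr1", 100): 0.2,
    ("r4", "chr1", 100): 0.9,  # only tool B reports this one
}
example_freqs = site_frequencies(consensus_calls(example_a, example_b))
```

In this toy input, only read `r1` has a mean probability at or above the threshold at chr1:100, so one of the three shared reads is called methylated there. Per-site frequencies computed this way are the kind of quantity that could be compared against the expected values from control mixtures or bisulfite sequencing.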