Genner Rylee, Akeson Stuart, Meredith Melissa, Jerez Pilar Alvarez, Malik Laksh, Baker Breeana, Miano-Burkhardt Abigail, Paten Benedict, Billingsley Kimberley J, Blauwendraat Cornelis, Jain Miten
Center for Alzheimer's and Related Dementias, National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA.
Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
bioRxiv. 2024 Mar 1:2024.02.29.581569. doi: 10.1101/2024.02.29.581569.
DNA methylation most commonly occurs as 5-methylcytosine (5mC) in the human genome and has been associated with human diseases. Recent developments in single-molecule sequencing technologies (Oxford Nanopore Technologies (ONT) and Pacific Biosciences) have enabled readouts of long, native DNA molecules, including cytosine methylation. ONT recently upgraded their Nanopore sequencing chemistry and kits from R9 to the R10 version, which yielded increased accuracy and sequencing throughput. However, the effects of this upgrade on methylation detection have not yet been documented. Here we performed a series of computational analyses to characterize differences in Nanopore-based 5mC detection between the ONT R9 and R10 chemistries. We compared 5mC calls in R9 and R10 for three human genome datasets: a cell line, a frontal cortex brain sample, and a blood sample. We performed an in-depth analysis of CpG islands and homopolymer regions, and documented high concordance for methylation detection among sequencing technologies. The strongest correlation was observed between Nanopore R10 and Illumina bisulfite technologies for cell line-derived datasets. Subtle differences in methylation datasets between technologies can impact analysis tools such as differential methylation calling software. Our findings show that comparisons can be drawn between methylation data from different Nanopore chemistries using guided hypotheses. This work will facilitate comparison among Nanopore data cohorts derived using different chemistries from large-scale sequencing efforts, such as the NIH CARD Long Read Initiative.
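As an illustration of what a concordance comparison between two chemistries or technologies can look like, the following is a minimal sketch, not the authors' pipeline. It assumes per-CpG calls have already been reduced to counts of methylated and total reads per site. The function names, the input layout, and the minimum-coverage threshold of 10 reads are all hypothetical choices made for this example, and Pearson correlation is used here only as one plausible concordance measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def methylation_concordance(calls_a, calls_b, min_coverage=10):
    """Correlate per-site methylation fractions between two call sets.

    Each call set maps (chrom, position) -> (methylated_reads, total_reads).
    Only sites present in both sets with at least `min_coverage` reads in
    each are compared (the threshold is an assumption of this sketch).
    Returns (number_of_shared_sites, pearson_r).
    """
    shared = sorted(
        site
        for site in calls_a
        if site in calls_b
        and calls_a[site][1] >= min_coverage
        and calls_b[site][1] >= min_coverage
    )
    frac_a = [calls_a[s][0] / calls_a[s][1] for s in shared]
    frac_b = [calls_b[s][0] / calls_b[s][1] for s in shared]
    return len(shared), pearson(frac_a, frac_b)
```

In practice the two dictionaries would be filled from per-site methylation summaries for, say, an R9 run and an R10 run of the same sample. Restricting the comparison to shared, sufficiently covered sites keeps low-coverage positions from dominating the correlation.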