DeepBAM: a high-accuracy single-molecule CpG methylation detection tool for Oxford nanopore sequencing.

Affiliations

State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, 7 Jinsui Road, Tianhe District, Guangzhou 510060, China.

School of Artificial Intelligence, Sun Yat-Sen University, Gaoxin District, Zhuhai 519000, China.

Publication information

Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae413.

Abstract

The recent nanopore sequencing system (R10.4) has improved base-calling accuracy and is increasingly used to detect CpG methylation state. However, the robustness and universality of the methylation-calling model in the officially supplied Dorado remain poorly tested. In this study, we obtained heterogeneous datasets from human and plant sources and carried out comprehensive evaluations, which showed that Dorado's performance differed significantly across datasets. We therefore developed deep neural networks and implemented several optimizations in training a new model called DeepBAM. DeepBAM achieved better and more stable performance than Dorado across the datasets, including higher area under the ROC curve (98.47% on average, with up to 7.36% improvement) and higher F1 scores (94.97% on average, with up to 16.24% improvement). DeepBAM-based whole-genome methylation frequencies achieved correlations of >0.95 with BS-seq on four of five datasets, outperforming Dorado in all instances. DeepBAM also enables the unraveling of allele-specific methylation patterns, including in regions of transposable elements. The enhanced performance of DeepBAM paves the way for broader applications of nanopore sequencing in CpG methylation studies.
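The abstract reports three kinds of figures: read-level area under the ROC curve, read-level F1 score, and site-level correlation of methylation frequencies with BS-seq. As a rough illustration of what these metrics measure, the following is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names, the 0.5 calling threshold, the use of Pearson correlation, and the toy data are all assumptions made here for illustration.

```python
from collections import defaultdict
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the probability that a random methylated read-site scores
    higher than a random unmethylated one (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 of binary calls obtained by thresholding the probabilities.
    The 0.5 threshold is an assumption, not taken from the paper."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def site_frequencies(calls, threshold=0.5):
    """Aggregate (site, probability) read-level calls into the fraction
    of reads called methylated at each site."""
    counts = defaultdict(lambda: [0, 0])  # site -> [methylated, total]
    for site, prob in calls:
        counts[site][0] += 1 if prob >= threshold else 0
        counts[site][1] += 1
    return {site: m / t for site, (m, t) in counts.items()}


def pearson(x, y):
    """Pearson correlation between two equal-length frequency vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


if __name__ == "__main__":
    # Toy data only: 1 = methylated, 0 = unmethylated ground truth.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
    print("AUC:", roc_auc(labels, scores))
    print("F1:", f1_score(labels, scores))

    # Hypothetical nanopore-derived vs. BS-seq frequencies at three sites.
    nanopore = [0.10, 0.55, 0.95]
    bs_seq = [0.05, 0.60, 0.90]
    print("Pearson r:", pearson(nanopore, bs_seq))
```

AUC is threshold-free, whereas F1 depends on the cutoff used to turn probabilities into calls, which is one reason the two metrics can show improvements of different sizes on the same dataset.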

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77d0/11342253/a6167fe891e9/bbae413f1.jpg
