MultiNanopolish: refined grouping method for reducing redundant calculations in Nanopolish.

Authors

Hu Kang, Huang Neng, Zou You, Liao Xingyu, Wang Jianxin

Affiliation

Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China.

Publication

Bioinformatics. 2021 Sep 9;37(17):2757-2760. doi: 10.1093/bioinformatics/btab078.

Abstract

MOTIVATION

Compared with second-generation sequencing technologies, third-generation sequencing technologies allow us to obtain longer reads (average ∼10 kbp, maximum 900 kbp) but bring a higher error rate (∼15%). Nanopolish is a variant and methylation detection tool based on a hidden Markov model, which uses Oxford Nanopore sequencing data for signal-level analysis. Nanopolish can greatly improve the accuracy of an assembly, but its running time is long because most of its execution is serial and computationally expensive.

RESULTS

In this paper, we present an effective polishing tool, Multithreading Nanopolish (MultiNanopolish), which decomposes the whole iterative calculation process in Nanopolish into small independent calculation tasks so that the process can run in parallel. Experimental results show that, compared with the original Nanopolish and running with 40 threads, MultiNanopolish reduces running time by 50% on assemblies from a read-uncorrected assembler (Miniasm) and by 20% on assemblies from read-corrected assemblers (Canu and Flye).
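
The general idea of decomposing a long serial calculation into small independent tasks can be illustrated with a minimal Python sketch. This is not the MultiNanopolish implementation: the function names, the fixed-length segmentation, and the placeholder `polish_segment` step are all assumptions made for illustration. The abstract does not describe how the tool actually groups or schedules its tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(sequence, segment_length):
    """Cut a draft sequence into consecutive, non-overlapping segments."""
    return [sequence[i:i + segment_length]
            for i in range(0, len(sequence), segment_length)]


def polish_segment(segment):
    """Hypothetical stand-in for a per-segment polishing step.

    A real polisher would score candidate edits against signal data;
    here the bases are only uppercased so the example stays runnable.
    """
    return segment.upper()


def polish_in_parallel(sequence, segment_length=4, threads=4):
    """Treat each segment as an independent task, then rejoin in order."""
    segments = split_into_segments(sequence, segment_length)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, so segments rejoin correctly.
        polished = list(pool.map(polish_segment, segments))
    return "".join(polished)


if __name__ == "__main__":
    print(polish_in_parallel("acgtacgtac", segment_length=4, threads=3))
```

The sketch only shows the task structure: because no segment depends on the result of another, the tasks can be handed to a worker pool in any order. In CPython, threads do not speed up CPU-bound pure-Python work, so a process pool or a native implementation would be needed to see a real reduction in running time.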

AVAILABILITY AND IMPLEMENTATION

MultiNanopolish is available at GitHub: https://github.com/BioinformaticsCSU/MultiNanopolish.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
