BLESS 2: accurate, memory-efficient and fast error correction method.

Author information

Heo Yun, Ramachandran Anand, Hwu Wen-Mei, Ma Jian, Chen Deming

Affiliations

Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

Publication information

Bioinformatics. 2016 Aug 1;32(15):2369-71. doi: 10.1093/bioinformatics/btw146. Epub 2016 Mar 24.

DOI:10.1093/bioinformatics/btw146

PMID:27153708

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6280799/

Abstract

The most important features of error correction tools for sequencing data are accuracy, memory efficiency and fast runtime. The previous version of BLESS was highly memory-efficient and accurate, but it was too slow to handle reads from large genomes. We have developed a new version of BLESS to improve runtime and accuracy while maintaining a small memory usage. The new version, called BLESS 2, has an error correction algorithm that is more accurate than BLESS, and the algorithm has been parallelized using hybrid MPI and OpenMP programming. BLESS 2 was compared with five top-performing tools, and it was found to be the fastest when it was executed on two computing nodes using MPI, with each node containing twelve cores. Also, BLESS 2 showed at least 11% higher gain while retaining the memory efficiency of the previous version for large genomes.
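The abstract reports accuracy as "gain" without defining it. The following is a minimal sketch, assuming the definition commonly used in read error-correction benchmarks, gain = (TP - FP) / (TP + FN), where TP counts erroneous bases that were fixed, FP counts correct bases that were wrongly changed, and FN counts erroneous bases left uncorrected. The function name `correction_gain` and the sample counts are hypothetical and do not come from the paper.

```python
def correction_gain(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the error-correction gain, assumed here to be (TP - FP) / (TP + FN).

    TP + FN is the total number of erroneous bases in the input reads, so a
    gain of 1.0 means every error was fixed and nothing correct was altered.
    A negative gain means the tool introduced more errors than it removed.
    """
    total_errors = true_positives + false_negatives
    if total_errors == 0:
        raise ValueError("input contains no errors; gain is undefined")
    return (true_positives - false_positives) / total_errors


# Hypothetical counts for illustration only (not results from the paper).
good_run = correction_gain(true_positives=900, false_positives=50, false_negatives=100)
bad_run = correction_gain(true_positives=10, false_positives=30, false_negatives=90)
print(f"good run gain: {good_run:.2f}")
print(f"bad run gain: {bad_run:.2f}")
```

Under this assumed definition, a tool is penalized both for missing errors and for miscorrecting bases that were already right, which is why gain can separate tools that fix similar numbers of errors.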

AVAILABILITY AND IMPLEMENTATION

Freely available at https://sourceforge.net/projects/bless-ec

CONTACT

dchen@illinois.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
