Whisper: read sorting allows robust mapping of DNA sequencing data.

Author information

Institute of Informatics, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, Gliwice, PL, Poland.

Institute of Applied Computer Science, Faculty of Electrical, Electronic, Computer and Control Engineering, Lodz University of Technology, Stefanowskiego 18/22, Łódź, PL, Poland.

Publication information

Bioinformatics. 2019 Jun 1;35(12):2043-2050. doi: 10.1093/bioinformatics/bty927.

DOI:10.1093/bioinformatics/bty927

PMID:30407485

Abstract

MOTIVATION

Mapping reads to a reference genome is often the first step in a sequencing data analysis pipeline. The reduction of sequencing costs implies a need for algorithms able to process increasing amounts of generated data in reasonable time.

RESULTS

We present Whisper, an accurate, high-performance mapping tool based on the idea of sorting reads and then mapping them against suffix arrays for the reference genome and its reverse complement. Employing task and data parallelism, as well as storing temporary data on disk, results in superior time efficiency at reasonable memory requirements. Whisper excels at large NGS read collections, in particular Illumina reads with typical WGS coverage. Experiments with real data indicate that our solution runs in about 15% of the time needed by the well-known BWA-MEM and Bowtie2 tools at comparable accuracy, as validated in a variant calling pipeline.
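The core idea of sorting reads and looking them up in suffix arrays of the reference and its reverse complement can be illustrated with a toy sketch. This is not Whisper's implementation: it handles exact matches only, builds the suffix arrays naively, and uses hypothetical names throughout (`build_suffix_array`, `lower_bound`, `map_reads`). Carrying the previous search position forward for the next sorted read is an assumption about one way sorting can help; the abstract does not confirm it.

```python
# Toy illustration only: exact matching, naive suffix array construction.
# All function names are hypothetical and not part of Whisper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def build_suffix_array(text):
    """Naive construction by sorting suffixes; suitable for tiny inputs only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lower_bound(text, sa, pattern, lo=0):
    """First index >= lo in sa whose suffix, cut to len(pattern), is >= pattern."""
    hi = len(sa)
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo


def map_reads(reference, reads):
    """Map reads to both strands; positions are 0-based forward coordinates."""
    rc = reverse_complement(reference)
    indexes = [
        ("+", reference, build_suffix_array(reference)),
        ("-", rc, build_suffix_array(rc)),
    ]
    hits = {read: [] for read in reads}
    for strand, text, sa in indexes:
        lo = 0
        # Because reads are visited in sorted order, the lower bound never
        # moves backwards, so each search can start where the last one ended.
        for read in sorted(set(reads)):
            lo = lower_bound(text, sa, read, lo)
            i = lo
            while i < len(sa) and text[sa[i]:sa[i] + len(read)] == read:
                pos = sa[i]
                if strand == "-":
                    # Convert to the leftmost forward-strand coordinate.
                    pos = len(reference) - pos - len(read)
                hits[read].append((strand, pos))
                i += 1
    return {read: sorted(found) for read, found in hits.items()}
```

Calling `map_reads(reference, reads)` returns, for each read, a list of `(strand, position)` pairs. A real mapper would additionally tolerate mismatches and indels, use a scalable suffix array construction, and process reads in parallel batches.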

AVAILABILITY AND IMPLEMENTATION

Whisper is available for free from https://github.com/refresh-bio/Whisper or http://sun.aei.polsl.pl/REFRESH/Whisper/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
