GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

Affiliations

Department of Computer Engineering, Bilkent University, Bilkent, Ankara 06800, Turkey.

TOBB University of Economics & Technology, Sogutozu, Ankara, Turkey.

Publication information

Bioinformatics. 2017 Nov 1;33(21):3355-3363. doi: 10.1093/bioinformatics/btx342.

Abstract

MOTIVATION

High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments, called short reads, that cause a significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms.
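To illustrate why alignment is quadratic, the sketch below computes the edit (Levenshtein) distance between a read and a reference segment with the textbook dynamic program. It is a generic illustration, not the verification routine of any particular mapper; the function name `edit_distance` and the example sequences are assumptions made here.

```python
def edit_distance(read: str, ref: str) -> int:
    """Textbook Levenshtein distance: O(len(read) * len(ref)) time.

    Illustrative only; real mappers use banded or bit-parallel variants.
    """
    # prev[j] holds the distance between the read prefix processed so far
    # and the first j characters of ref.
    prev = list(range(len(ref) + 1))
    for i, r in enumerate(read, start=1):
        curr = [i]  # aligning i read characters against an empty ref prefix
        for j, c in enumerate(ref, start=1):
            cost = 0 if r == c else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from the read
                curr[j - 1] + 1,     # insert a character into the read
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical 10-base read and reference segment.
    print(edit_distance("ACGTGCAATC", "ACGTTGCAAT"))
```

Every cell of the table is filled for every candidate location, whether or not the location is plausible, which is why rejecting bad candidates beforehand pays off.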

RESULTS

We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.
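The general idea of a pre-alignment filter can be sketched in software as follows. This is a simplified illustration loosely inspired by shifted-Hamming-style filtering, not GateKeeper's FPGA design or the SHD implementation; the function name `passes_filter`, the parameter `max_edits`, and the equal-length assumption are all choices made here for illustration. A read base counts as "explained" if it matches the reference at the same position or at any position shifted by at most `max_edits`; if more than `max_edits` bases remain unexplained, the candidate cannot align within the threshold and is rejected.

```python
def passes_filter(read: str, ref: str, max_edits: int) -> bool:
    """Conservative pre-alignment check (illustrative sketch).

    Returns False only when the candidate cannot align within max_edits
    edits; it may still accept some candidates that full alignment
    would later reject.
    """
    if len(read) != len(ref):
        raise ValueError("this sketch assumes equal-length read and reference")
    n = len(read)
    unexplained = 0
    for i in range(n):
        explained = False
        for shift in range(-max_edits, max_edits + 1):
            j = i + shift
            # Positions shifted past either end are treated as matches,
            # so the filter never rejects a valid candidate.
            if j < 0 or j >= n or read[i] == ref[j]:
                explained = True
                break
        if not explained:
            unexplained += 1
            if unexplained > max_edits:
                return False  # too many mismatches: skip full alignment
    return True


if __name__ == "__main__":
    print(passes_filter("ACGTGCAATC", "ACGTTGCAAT", 2))  # similar pair
    print(passes_filter("AAAAAAAAAA", "CCCCCCCCCC", 2))  # dissimilar pair
```

Only candidates for which this check returns True would be passed on to the expensive dynamic programming step; the comparisons involved are simple and independent per position, which is the kind of work that maps well to parallel hardware.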

AVAILABILITY AND IMPLEMENTATION

https://github.com/BilkentCompGen/GateKeeper.

CONTACT

mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
