Arram James, Kaplan Thomas, Luk Wayne, Jiang Peiyong
IEEE/ACM Trans Comput Biol Bioinform. 2017 May-Jun;14(3):668-677. doi: 10.1109/TCBB.2016.2535385. Epub 2016 Feb 29.
One of the key challenges facing genomics today is how to efficiently analyze the massive amounts of data produced by next-generation sequencing platforms. With general-purpose computing systems struggling to address this challenge, specialized processors such as the Field-Programmable Gate Array (FPGA) are receiving growing interest. How to leverage this technology to accelerate genomic data analysis is, however, largely unexplored. In this paper, we present a runtime reconfigurable architecture for accelerating short read alignment using FPGAs. This architecture exploits the reconfigurability of FPGAs to allow the development of fast yet flexible alignment designs. We apply this architecture to develop an alignment design that supports exact alignment and approximate alignment with up to two mismatches. Our design is based on the FM-index, with several optimizations to improve alignment performance: the n-step FM-index, index oversampling, a seed-and-compare stage, and bi-directional backtracking. Our design is implemented and evaluated on a 1U Maxeler MPC-X2000 dataflow node with eight Altera Stratix-V FPGAs. Measurements show that our design is 28 times faster than Bowtie2 running with 16 threads on dual Intel Xeon E5-2640 CPUs, and nine times faster than SOAP3-dp running on an NVIDIA Tesla C2070 GPU.
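For readers unfamiliar with FM-index alignment, the following is a minimal software sketch of the underlying idea: backward search for exact matches, plus simple backtracking that tolerates a bounded number of mismatches. It is an illustration under stated assumptions, not the FPGA design described in the abstract. It omits the n-step FM-index, index oversampling, the seed-and-compare stage, and bi-directional backtracking, and the function names (`build_fm_index`, `exact_match`, `approx_match`) and the toy reference string are hypothetical.

```python
from collections import Counter


def build_fm_index(reference):
    """Build a toy FM-index (suffix array, C table, Occ table)."""
    text = reference + "$"  # "$" is a sentinel that sorts before A, C, G, T
    # Naive suffix array construction; acceptable only for short strings.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # Each BWT character is the one preceding the corresponding sorted suffix.
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # c_table[c] = number of characters in text strictly smaller than c.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return suffix_array, c_table, occ


def exact_match(read, suffix_array, c_table, occ):
    """Backward search: return sorted reference positions matching `read`."""
    low, high = 0, len(suffix_array)  # half-open suffix array interval
    for ch in reversed(read):
        if ch not in c_table:
            return []
        low = c_table[ch] + occ[ch][low]
        high = c_table[ch] + occ[ch][high]
        if low >= high:
            return []
    return sorted(suffix_array[low:high])


def approx_match(read, suffix_array, c_table, occ, max_mismatches=2):
    """Backtracking search allowing up to `max_mismatches` substitutions."""
    hits = set()

    def extend(pos, low, high, budget):
        if low >= high:
            return  # empty interval: this branch matches nothing
        if pos < 0:
            hits.update(suffix_array[low:high])
            return
        for ch in c_table:
            if ch == "$":
                continue  # never align against the sentinel
            cost = 0 if ch == read[pos] else 1
            if cost > budget:
                continue
            extend(pos - 1,
                   c_table[ch] + occ[ch][low],
                   c_table[ch] + occ[ch][high],
                   budget - cost)

    extend(len(read) - 1, 0, len(suffix_array), max_mismatches)
    return sorted(hits)


if __name__ == "__main__":
    index = build_fm_index("ACGTACGTTAGC")  # toy reference, not real data
    print(exact_match("ACG", *index))
    print(approx_match("ACC", *index, max_mismatches=1))
```

The exact search narrows a suffix array interval one read character at a time, so its cost grows with the read length rather than the reference length. The mismatch search simply tries every alternative base at each position while a mismatch budget remains, which is the step that the optimizations listed in the abstract aim to make cheaper.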