Department of Computer Sciences, Barcelona Supercomputing Center, Barcelona 08034, Spain.
Department of Computer Science, Universitat Politècnica de Catalunya, Barcelona 08034, Spain.
Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae631.
Recent advances in sequencing technologies have stressed the critical role of sequence analysis algorithms and tools in genomics and healthcare research. In particular, sequence alignment is a fundamental building block in many sequence analysis pipelines and is frequently a performance bottleneck both in terms of execution time and memory usage. Classical sequence alignment algorithms are based on dynamic programming and often require quadratic time and memory with respect to the sequence length. As a result, classical sequence alignment algorithms fail to scale with increasing sequence lengths and quickly become memory-bound due to data-movement penalties.
Processing-In-Memory (PIM) is an emerging architectural paradigm that seeks to accelerate memory-bound algorithms by bringing computation closer to the data to mitigate data-movement penalties. This work presents BIMSA (Bidirectional In-Memory Sequence Alignment), a PIM design and implementation for the state-of-the-art sequence alignment algorithm BiWFA (Bidirectional Wavefront Alignment), incorporating new hardware-aware optimizations for a production-ready PIM architecture (UPMEM). BIMSA supports aligning sequences up to 100K bases, exceeding the limitations of state-of-the-art PIM implementations. First, BIMSA achieves speedups up to 22.24× (11.95× on average) compared to state-of-the-art PIM-enabled implementations of sequence alignment algorithms. Second, achieves speedups up to 5.84× (2.83× on average) compared to the highest-performance multicore CPU implementation of BiWFA. Third, BIMSA exhibits linear scalability with the number of compute units in memory, enabling further performance improvements with upcoming PIM architectures equipped with more compute units and achieving speedups up to 9.56× (4.7× on average).
Code and documentation are publicly available at https://github.com/AlejandroAMarin/BIMSA.
测序技术的最新进展强调了序列分析算法和工具在基因组学和医疗保健研究中的关键作用。特别是,序列比对是许多序列分析管道的基本构建块,无论是在执行时间还是内存使用方面,通常都是性能瓶颈。经典的序列比对算法基于动态规划,通常需要与序列长度成二次方的时间和内存。因此,经典的序列比对算法无法随着序列长度的增加而扩展,并且由于数据移动开销很快就会受到内存的限制。
处理内存在(PIM)是一种新兴的架构范例,旨在通过将计算更接近数据来加速受内存限制的算法,从而减轻数据移动开销。这项工作提出了 BIMSA(双向内存序列比对),这是一种针对最先进的序列比对算法 BiWFA(双向波前比对)的 PIM 设计和实现,为生产就绪的 PIM 架构(UPMEM)纳入了新的硬件感知优化。BIMSA 支持比对长达 100K 个碱基的序列,超过了最先进的 PIM 实现的限制。首先,与最先进的 PIM 启用的序列比对算法实现相比,BIMSA 实现了高达 22.24 倍(平均 11.95 倍)的加速。其次,与 BiWFA 的最高性能多核 CPU 实现相比,实现了高达 5.84 倍(平均 2.83 倍)的加速。第三,BIMSA 具有与内存中计算单元数量的线性可扩展性,使具有更多计算单元的即将推出的 PIM 架构能够进一步提高性能,并实现高达 9.56 倍(平均 4.7 倍)的加速。
代码和文档可在 https://github.com/AlejandroAMarin/BIMSA 上公开获取。