School of Computer Science and Engineering, UNSW Sydney, Sydney, NSW 2052, Australia.
School of Electrical Engineering and Telecommunications, UNSW Sydney, Sydney, NSW 2052, Australia.
Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad046. Epub 2023 Jul 3.
Third-generation nanopore sequencers offer selective sequencing, or "Read Until," which allows genomic reads to be analyzed in real time and abandoned partway through if they do not belong to a genomic region of interest. Selective sequencing opens the door to important applications such as rapid and low-cost genetic tests. For it to be effective, analysis latency must be as low as possible so that unnecessary reads can be rejected as early as possible. However, existing methods that employ a subsequence dynamic time warping (sDTW) algorithm for this problem are so computationally intensive that a massive workstation with dozens of CPU cores still struggles to keep up with the data rate of a mobile phone-sized MinION sequencer.
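To make the cost of sDTW concrete, the following is a minimal software sketch of subsequence dynamic time warping, not the HARU implementation. It assumes the read and the reference are plain lists of numbers and uses absolute difference as the local cost; the function name `sdtw` and these choices are illustrative assumptions. The whole query must be aligned, but the alignment may start and end anywhere in the reference, which is what distinguishes sDTW from ordinary DTW.

```python
def sdtw(query, reference):
    """Subsequence DTW (illustrative sketch, not the HARU code).

    Aligns the whole query to the best-matching contiguous stretch of the
    reference and returns (best_cost, end_index_in_reference).
    """
    if not query or not reference:
        raise ValueError("query and reference must be non-empty")

    # First row: the alignment may start at any reference position,
    # so no cost is accumulated from the left.
    prev = [abs(query[0] - r) for r in reference]

    for q in query[1:]:
        # First column: the only predecessor is the cell directly above.
        curr = [prev[0] + abs(q - reference[0])]
        for j in range(1, len(reference)):
            best = min(prev[j], curr[j - 1], prev[j - 1])
            curr.append(best + abs(q - reference[j]))
        prev = curr

    # Free end: take the cheapest cell in the last row.
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


if __name__ == "__main__":
    cost, end = sdtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])
    print(cost, end)
```

Every query sample is compared against every reference position, so the work grows with the product of the two lengths. This quadratic cost per read is why a software-only approach struggles to keep pace with the sequencer. A Read Until decision could then compare the returned cost against a threshold; any such threshold would be an application-specific assumption and is not taken from the source.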
In this article, we present Hardware Accelerated Read Until (HARU), a resource-efficient method based on hardware-software codesign that exploits a low-cost, portable heterogeneous multiprocessor system-on-chip platform with an on-chip field-programmable gate array (FPGA) to accelerate the sDTW-based Read Until algorithm. Experimental results on a SARS-CoV-2 dataset show that HARU, running on a Xilinx FPGA with an embedded 4-core ARM processor, is around 2.5× faster than a highly optimized multithreaded software version (and around 85× faster than the existing unoptimized multithreaded software) running on a server with a 36-core Intel Xeon processor. The energy consumption of HARU is 2 orders of magnitude lower than that of the same application running on the 36-core server.
HARU demonstrates that nanopore selective sequencing is possible on resource-constrained devices through rigorous hardware-software optimizations. The source code for the HARU sDTW module is available as open source at https://github.com/beebdev/HARU, and an example application that uses HARU is at https://github.com/beebdev/sigfish-haru.