• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

SneakySnake:一种适用于CPU、GPU和FPGA的快速且准确的通用基因组预比对过滤器。

SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs.

作者信息

Alser Mohammed, Shahroodi Taha, Gómez-Luna Juan, Alkan Can, Mutlu Onur

机构信息

Department of Computer Science, ETH Zurich, Zurich 8006, Switzerland.

Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8006, Switzerland.

出版信息

Bioinformatics. 2021 Apr 1;36(22-23):5282-5290. doi: 10.1093/bioinformatics/btaa1015.

DOI:10.1093/bioinformatics/btaa1015
PMID:33315064
Abstract

MOTIVATION

We introduce SneakySnake, a highly parallel and highly accurate pre-alignment filter that remarkably reduces the need for computationally costly sequence alignment. The key idea of SneakySnake is to reduce the approximate string matching (ASM) problem to the single net routing (SNR) problem in VLSI chip layout. In the SNR problem, we are interested in finding the optimal path that connects two terminals with the least routing cost on a special grid layout that contains obstacles. The SneakySnake algorithm quickly solves the SNR problem and uses the found optimal path to decide whether or not performing sequence alignment is necessary. Reducing the ASM problem into SNR also makes SneakySnake efficient to implement on CPUs, GPUs and FPGAs.

RESULTS

SneakySnake significantly improves the accuracy of pre-alignment filtering by up to four orders of magnitude compared to the state-of-the-art pre-alignment filters, Shouji, GateKeeper and SHD. For short sequences, SneakySnake accelerates Edlib (state-of-the-art implementation of Myers's bit-vector algorithm) and Parasail (state-of-the-art sequence aligner with a configurable scoring function), by up to 37.7× and 43.9× (>12× on average), respectively, with its CPU implementation, and by up to 413× and 689× (>400× on average), respectively, with FPGA and GPU acceleration. For long sequences, the CPU implementation of SneakySnake accelerates Parasail and KSW2 (sequence aligner of minimap2) by up to 979× (276.9× on average) and 91.7× (31.7× on average), respectively. As SneakySnake does not replace sequence alignment, users can still obtain all capabilities (e.g. configurable scoring functions) of the aligner of their choice, unlike existing acceleration efforts that sacrifice some aligner capabilities.

AVAILABILITYAND IMPLEMENTATION

https://github.com/CMU-SAFARI/SneakySnake.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

摘要

动机

我们引入了SneakySnake,这是一种高度并行且高度准确的预比对过滤器,能显著减少对计算成本高昂的序列比对的需求。SneakySnake的关键思想是将近似字符串匹配(ASM)问题简化为超大规模集成电路(VLSI)芯片布局中的单网路由(SNR)问题。在SNR问题中,我们关注的是在包含障碍物的特殊网格布局上找到连接两个终端且路由成本最低的最优路径。SneakySnake算法能快速解决SNR问题,并利用找到的最优路径来决定是否有必要进行序列比对。将ASM问题简化为SNR问题也使得SneakySnake能够在CPU、GPU和FPGA上高效实现。

结果

与当前最先进的预比对过滤器Shouji、GateKeeper和SHD相比,SneakySnake将预比对过滤的准确率显著提高了多达四个数量级。对于短序列,SneakySnake的CPU实现分别将Edlib(Myers比特向量算法的当前最先进实现)和Parasail(具有可配置评分函数的当前最先进序列比对器)加速了多达37.7倍和43.9倍(平均>12倍),而通过FPGA和GPU加速时分别高达413倍和689倍(平均>400倍)。对于长序列,SneakySnake的CPU实现分别将Parasail和KSW2(minimap2的序列比对器)加速了多达979倍(平均276.9倍)和91.7倍(平均31.7倍)。由于SneakySnake并不替代序列比对,与现有的牺牲某些比对器功能的加速方法不同,用户仍然可以获得他们选择的比对器的所有功能(例如可配置评分函数)。

可用性与实现

https://github.com/CMU - SAFARI/SneakySnake。

补充信息

补充数据可在《生物信息学》在线获取。

相似文献

1
SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs.SneakySnake:一种适用于CPU、GPU和FPGA的快速且准确的通用基因组预比对过滤器。
Bioinformatics. 2021 Apr 1;36(22-23):5282-5290. doi: 10.1093/bioinformatics/btaa1015.
2
Shouji: a fast and efficient pre-alignment filter for sequence alignment.狩集:一种快速高效的序列比对预对齐滤波器。
Bioinformatics. 2019 Nov 1;35(21):4255-4263. doi: 10.1093/bioinformatics/btz234.
3
Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs.Scrooge:一种用于 CPU、GPU 和 ASIC 的快速且节省内存的基因组序列比对器。
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad151.
4
GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.GateKeeper:一种用于加速 DNA 短读映射预对齐的新硬件架构。
Bioinformatics. 2017 Nov 1;33(21):3355-3363. doi: 10.1093/bioinformatics/btx342.
5
SeedHit: A GPU Friendly Pre-Align Filtering Algorithm.SeedHit:一种对GPU友好的预比对过滤算法。
IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):1794-1802. doi: 10.1109/TCBB.2024.3417517. Epub 2024 Dec 10.
6
BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures.BGSA:用于多核和众核架构的位并行全局序列比对工具包。
Bioinformatics. 2019 Jul 1;35(13):2306-2308. doi: 10.1093/bioinformatics/bty930.
7
Accelerating Minimap2 for Accurate Long Read Alignment on GPUs.在GPU上加速Minimap2以实现准确的长读长比对
J Biotechnol Biomed. 2023;6(1):13-23. doi: 10.26502/jbb.2642-91280067. Epub 2023 Jan 20.
8
Distributed large-scale graph processing on FPGAs.基于现场可编程门阵列(FPGA)的分布式大规模图形处理
J Big Data. 2023;10(1):95. doi: 10.1186/s40537-023-00756-x. Epub 2023 Jun 4.
9
Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping.移位汉明距离:一种快速且准确的便于单指令多数据(SIMD)处理的过滤器,用于加速读段映射中的比对验证。
Bioinformatics. 2015 May 15;31(10):1553-60. doi: 10.1093/bioinformatics/btu856. Epub 2015 Jan 10.
10
H-BLAST: a fast protein sequence alignment toolkit on heterogeneous computers with GPUs.H-BLAST:一种用于具有图形处理器的异构计算机的快速蛋白质序列比对工具包。
Bioinformatics. 2017 Apr 15;33(8):1130-1138. doi: 10.1093/bioinformatics/btw769.

引用本文的文献

1
A survey of sequence-to-graph mapping algorithms in the pangenome era.泛基因组时代序列到图谱映射算法综述。
Genome Biol. 2025 May 22;26(1):138. doi: 10.1186/s13059-025-03606-6.
2
FPGA-based accelerator for adaptive banded event alignment in nanopore sequencing data analysis.用于纳米孔测序数据分析中自适应带状事件对齐的基于现场可编程门阵列的加速器
BMC Bioinformatics. 2025 Mar 17;26(1):83. doi: 10.1186/s12859-024-06011-1.
3
QuickEd: high-performance exact sequence alignment based on bound-and-align.QuickEd:基于绑定与比对的高性能精确序列比对
Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf112.
4
Taming large-scale genomic analyses via sparsified genomics.通过稀疏化基因组学实现大规模基因组分析的优化
Nat Commun. 2025 Jan 21;16(1):876. doi: 10.1038/s41467-024-55762-1.
5
A framework for high-throughput sequence alignment using real processing-in-memory systems.基于真实处理内存储系统的高通量序列比对框架。
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad155.
6
Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs.Scrooge:一种用于 CPU、GPU 和 ASIC 的快速且节省内存的基因组序列比对器。
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad151.
7
BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis.BLEND:一种在基因组分析中快速、节省内存且准确地查找模糊种子匹配项的机制。
NAR Genom Bioinform. 2023 Jan 20;5(1):lqad004. doi: 10.1093/nargab/lqad004. eCollection 2023 Mar.
8
Navigating bottlenecks and trade-offs in genomic data analysis.基因组数据分析中的瓶颈与权衡。
Nat Rev Genet. 2023 Apr;24(4):235-250. doi: 10.1038/s41576-022-00551-z. Epub 2022 Dec 7.
9
From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures.从分子到基因组变异:通过智能算法和架构加速基因组分析
Comput Struct Biotechnol J. 2022 Aug 18;20:4579-4599. doi: 10.1016/j.csbj.2022.08.019. eCollection 2022.
10
Fast and memory-efficient mapping of short bisulfite sequencing reads using a two-letter alphabet.使用双字母字母表对短亚硫酸氢盐测序读数进行快速且内存高效的映射。
NAR Genom Bioinform. 2021 Dec 22;3(4):lqab115. doi: 10.1093/nargab/lqab115. eCollection 2021 Dec.