
RawHash2: mapping raw nanopore signals using hash-based seeding and adaptive quantization.

Affiliation

Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae478.

DOI:10.1093/bioinformatics/btae478
PMID:39078113
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11333567/
Abstract

SUMMARY

Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based, efficient, and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including more sensitive quantization and chaining algorithms, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and the POD5 and SLOW5 file formats. Compared to RawHash, RawHash2 provides better F1 accuracy (on average by 10.57% and up to 20.25%) and better throughput (on average by 4.0× and up to 9.9×).
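
To make the terms in the summary concrete, the following is a minimal Python sketch of the general ideas it names: quantizing signal values, packing them into hash-like seeds, selecting minimizers, and filtering frequent seeds. It is not RawHash2's implementation. All function names, the fixed-width bucketing (the paper describes an adaptive quantization, which is not reproduced here), and the parameters `n_buckets`, `k`, `w`, and `max_freq` are assumptions chosen for illustration.

```python
from collections import defaultdict


def quantize(events, lo=-3.0, hi=3.0, n_buckets=16):
    """Map each normalized event value to an integer bucket in [0, n_buckets).

    Assumption: uniform buckets over [lo, hi]; values outside are clipped.
    """
    width = (hi - lo) / n_buckets
    out = []
    for v in events:
        clipped = min(max(v, lo), hi)
        out.append(min(int((clipped - lo) / width), n_buckets - 1))
    return out


def seeds(q, k=4, bits=4):
    """Pack every k consecutive quantized values into one integer seed.

    Returns (seed_value, start_position) pairs. Each value must fit in `bits`.
    """
    return [
        (sum(q[i + j] << (bits * (k - 1 - j)) for j in range(k)), i)
        for i in range(len(q) - k + 1)
    ]


def minimizers(seed_list, w=3):
    """Keep the smallest seed in each window of w consecutive seeds.

    Ties on the seed value are broken by position; a seed chosen by
    several adjacent windows is reported once.
    """
    picked = []
    for i in range(len(seed_list) - w + 1):
        best = min(seed_list[i:i + w])
        if not picked or picked[-1] != best:
            picked.append(best)
    return picked


def build_index(seed_list, max_freq=2):
    """Index seed value -> positions, dropping seeds seen more than max_freq times."""
    index = defaultdict(list)
    for value, pos in seed_list:
        index[value].append(pos)
    return {v: p for v, p in index.items() if len(p) <= max_freq}
```

In this sketch, a reference would be indexed with `build_index(minimizers(seeds(quantize(ref_events))))`, and a read's minimizers would be looked up in that index to collect candidate seed hits. Coarser buckets tolerate more signal noise at the cost of more ambiguous hits, which is the trade-off the frequency filter is meant to limit.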

AVAILABILITY AND IMPLEMENTATION

RawHash2 is available at https://github.com/CMU-SAFARI/RawHash. We also provide the scripts to fully reproduce our results on our GitHub page.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b88/11333567/648f58223221/btae478f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b88/11333567/07fc47a05d79/btae478f1.jpg
