ReadBouncer: precise and scalable adaptive sampling for nanopore sequencing.

Affiliations

Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany.

Bioinformatics Unit (MF1), Robert Koch Institute, 13353 Berlin, Germany.

Publication information

Bioinformatics. 2022 Jun 24;38(Suppl 1):i153-i160. doi: 10.1093/bioinformatics/btac223.

DOI:10.1093/bioinformatics/btac223

PMID:35758774

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9235500/

Abstract

MOTIVATION

Nanopore sequencers allow targeted sequencing of nucleotide sequences of interest by rejecting other sequences from individual pores. This feature facilitates the enrichment of low-abundance sequences by depleting overrepresented ones in silico. Existing tools for adaptive sampling either apply signal alignment, which cannot handle human-sized reference sequences, or apply read mapping in sequence space, relying on fast graphics processing unit (GPU) base callers for real-time read rejection. Nanopore long-read mapping tools are also not optimal for mapping the shorter reads that are usually analyzed in adaptive sampling applications.

RESULTS

Here, we present a new approach for nanopore adaptive sampling that combines fast CPU and GPU base calling with read classification based on Interleaved Bloom Filters. ReadBouncer improves the potential enrichment of low-abundance sequences through its high read classification sensitivity and specificity, outperforming existing tools in the field. It robustly removes even reads belonging to large reference sequences while running on commodity hardware without GPUs, making adaptive sampling accessible for in-field researchers. ReadBouncer also provides a user-friendly interface and installer files for end-users without a bioinformatics background.
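To make the idea of read classification with an Interleaved Bloom Filter more concrete, the following is a minimal Python sketch, not ReadBouncer's C++ implementation. It assumes one Bloom filter per reference bin, stored interleaved so that a single lookup of a k-mer returns the set of bins that may contain it. The class name `InterleavedBloomFilter`, the function `classify_read`, the k-mer size, the hash scheme and the `min_fraction` threshold are all illustrative assumptions and do not come from the paper.

```python
import hashlib


class InterleavedBloomFilter:
    """Toy interleaved Bloom filter: one Bloom filter per bin, stored row-wise.

    rows[i] is an integer bitmask whose bit j is set when position i
    of bin j's Bloom filter is set.
    """

    def __init__(self, num_bins, bits_per_bin, num_hashes=3, k=13):
        self.num_bins = num_bins
        self.bits_per_bin = bits_per_bin
        self.num_hashes = num_hashes
        self.k = k
        self.rows = [0] * bits_per_bin

    def _positions(self, kmer):
        # Derive num_hashes row positions from salted BLAKE2b digests (illustrative choice).
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.bits_per_bin

    def _kmers(self, seq):
        seq = seq.upper()
        return (seq[i:i + self.k] for i in range(len(seq) - self.k + 1))

    def add_sequence(self, bin_index, seq):
        """Insert every k-mer of a reference sequence into the given bin."""
        mask = 1 << bin_index
        for kmer in self._kmers(seq):
            for pos in self._positions(kmer):
                self.rows[pos] |= mask

    def count_hits(self, read):
        """Return (per-bin k-mer hit counts, number of k-mers in the read)."""
        counts = [0] * self.num_bins
        total = 0
        for kmer in self._kmers(read):
            total += 1
            # AND the rows of all hash positions: surviving bits are candidate bins.
            hit_mask = (1 << self.num_bins) - 1
            for pos in self._positions(kmer):
                hit_mask &= self.rows[pos]
            for b in range(self.num_bins):
                if (hit_mask >> b) & 1:
                    counts[b] += 1
        return counts, total


def classify_read(ibf, read, min_fraction=0.6):
    """Return the bins whose share of matching k-mers reaches min_fraction."""
    counts, total = ibf.count_hits(read)
    if total == 0:
        return []
    return [b for b, c in enumerate(counts) if c / total >= min_fraction]
```

In an adaptive sampling setting, a read prefix whose classification matches a depletion target would be rejected from the pore, while unmatched reads would continue sequencing. A Bloom filter never misses a k-mer that was inserted, but it can report false positives, which is why the sketch applies a threshold on the fraction of matching k-mers rather than acting on a single hit.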

AVAILABILITY AND IMPLEMENTATION

The C++ source code is available at https://gitlab.com/dacs-hpi/readbouncer.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

Icarust, a real-time simulator for Oxford Nanopore adaptive sampling.

Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae141.

NanoGalaxy: Nanopore long-read sequencing data analysis in Galaxy.

Gigascience. 2020 Oct 17;9(10). doi: 10.1093/gigascience/giaa105.

PyPore: a python toolbox for nanopore sequencing data handling.

Bioinformatics. 2019 Nov 1;35(21):4445-4447. doi: 10.1093/bioinformatics/btz269.

DeepSimulator1.5: a more powerful, quicker and lighter simulator for Nanopore sequencing.

Bioinformatics. 2020 Apr 15;36(8):2578-2580. doi: 10.1093/bioinformatics/btz963.

QAlign: aligning nanopore reads accurately using current-level modeling.

Bioinformatics. 2021 May 5;37(5):625-633. doi: 10.1093/bioinformatics/btaa875.

Real-time mapping of nanopore raw signals.

Bioinformatics. 2021 Jul 12;37(Suppl_1):i477-i483. doi: 10.1093/bioinformatics/btab264.

BulkVis: a graphical viewer for Oxford nanopore bulk FAST5 files.

Bioinformatics. 2019 Jul 1;35(13):2193-2198. doi: 10.1093/bioinformatics/bty841.

NanoSNP: a progressive and haplotype-aware SNP caller on low-coverage nanopore sequencing data.

Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac824.

NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data.

Bioinformatics. 2021 Jul 12;37(11):1600-1601. doi: 10.1093/bioinformatics/btaa900.

Cited by

Impact of microbiological molecular methodologies on adaptive sampling using nanopore sequencing in metagenomic studies.

Environ Microbiome. 2025 May 5;20(1):47. doi: 10.1186/s40793-025-00704-7.

Nanopore adaptive sampling to identify the NLR gene family in melon (Cucumis melo L.).

BMC Genomics. 2025 Feb 10;26(1):126. doi: 10.1186/s12864-025-11295-5.

Nanopore sequencing of protozoa: Decoding biological information on a string of biochemical molecules into human-readable signals.

Comput Struct Biotechnol J. 2025 Jan 6;27:440-450. doi: 10.1016/j.csbj.2025.01.002. eCollection 2025.

Real-time and programmable transcriptome sequencing with PROFIT-seq.

Nat Cell Biol. 2024 Dec;26(12):2183-2194. doi: 10.1038/s41556-024-01537-1. Epub 2024 Oct 23.

ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing.

Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae435.

A case of an Angelman-syndrome caused by an intragenic duplication of UBE3A uncovered by adaptive nanopore sequencing.

Clin Epigenetics. 2024 Aug 2;16(1):101. doi: 10.1186/s13148-024-01711-0.

RawHash2: mapping raw nanopore signals using hash-based seeding and adaptive quantization.

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae478.

Advances in Host Depletion and Pathogen Enrichment Methods for Rapid Sequencing-Based Diagnosis of Bloodstream Infection.

J Mol Diagn. 2024 Sep;26(9):741-753. doi: 10.1016/j.jmoldx.2024.05.008. Epub 2024 Jun 24.

Fast and space-efficient taxonomic classification of long reads with hierarchical interleaved XOR filters.

Genome Res. 2024 Jul 23;34(6):914-924. doi: 10.1101/gr.278623.123.

Biochemical-free enrichment or depletion of RNA classes in real-time during direct RNA sequencing with RISER.

Nat Commun. 2024 May 24;15(1):4422. doi: 10.1038/s41467-024-48673-8.
