SimReadUntil for benchmarking selective sequencing algorithms on ONT devices.

Author information

Biomedical Informatics Group, Department of Computer Science, ETH Zurich, Zürich, 8092, Switzerland.

Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, 72076, Germany.

Publication information

Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae199.

Abstract

MOTIVATION

The Oxford Nanopore Technologies (ONT) ReadUntil API enables selective sequencing, which aims to favor interesting over uninteresting reads, e.g. to deplete or enrich certain genomic regions. The performance gain depends on the selective sequencing decision-making algorithm (SSDA), which decides whether to reject a read, stop receiving a read, or wait for more data. Since real runs are time-consuming and costly, simulating the ONT sequencer with support for the ReadUntil API is highly beneficial for comparing and optimizing new SSDAs. Existing software such as MinKNOW and UNCALLED returns only raw signal data, is memory-intensive, requires very large and often unavailable multi-fast5 files (≥100 GB), and is not clearly documented.
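
To make the three possible SSDA outcomes concrete, here is a minimal Python sketch of a toy decision function. It is not taken from SimReadUntil, ReadFish, or ReadBouncer: the names `Decision` and `decide`, the k-mer matching rule, and the parameters `k` and `max_prefix_len` are all assumptions chosen for illustration, and real SSDAs typically align the read prefix to a reference instead.

```python
from enum import Enum
from typing import Set


class Decision(Enum):
    REJECT = "reject"                  # eject the read from the pore
    STOP_RECEIVING = "stop_receiving"  # keep sequencing, stop streaming its data
    WAIT = "wait"                      # request more data before deciding


def decide(prefix: str, targets: Set[str], k: int = 5,
           max_prefix_len: int = 20) -> Decision:
    """Toy SSDA (hypothetical rule, for illustration only).

    Keep the read as soon as its prefix contains a target k-mer, reject it
    once max_prefix_len bases were seen without a hit, otherwise wait.
    """
    kmers = {prefix[i:i + k] for i in range(len(prefix) - k + 1)}
    if kmers & targets:
        return Decision.STOP_RECEIVING
    if len(prefix) >= max_prefix_len:
        return Decision.REJECT
    return Decision.WAIT
```

Calling `decide` repeatedly on a growing prefix mimics how an SSDA receives a read in chunks: it returns `Decision.WAIT` until enough sequence has arrived to commit to one of the other two outcomes.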

RESULTS

We present the ONT device simulator SimReadUntil, which takes a set of full reads as input, distributes them to channels, and plays them back in real time, including mux scans, channel gaps and blockages. It allows rejecting reads as well as stopping the reception of data from them. Our modified ReadUntil API provides the basecalled reads rather than the raw signal, reducing computational load and focusing on the SSDA rather than on basecalling. Tuning the parameters of tools like ReadFish and ReadBouncer becomes easier because a GPU for basecalling is no longer required. We offer various methods to extract simulation parameters from a sequencing summary file and adapt ReadFish to replicate one of its enrichment experiments. SimReadUntil's gRPC interface allows standardized interaction with a wide range of programming languages.
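
The playback idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not SimReadUntil's implementation or API: the names `Channel`, `distribute`, and `simulate` are hypothetical, time advances in discrete ticks instead of real time, reads are assigned round-robin, and mux scans, blockages, and the stop-receiving action are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    reads: list    # full reads (strings) queued on this channel
    gap: int = 2   # hypothetical number of idle ticks between two reads
    pos: int = 0   # bases of the current read emitted so far
    idle: int = 0  # idle ticks remaining before the next read starts
    sequenced: list = field(default_factory=list)  # bases sequenced per read

    def finish(self):
        """End the current read (completed or rejected) and start a gap."""
        self.sequenced.append(self.reads.pop(0)[:self.pos])
        self.pos, self.idle = 0, self.gap


def distribute(reads, n_channels):
    """Assign full reads to channels round-robin (an assumed strategy)."""
    return [Channel(reads[i::n_channels]) for i in range(n_channels)]


def simulate(channels, ssda, bases_per_tick=4):
    """Advance simulated time until every channel has played back its reads.

    ssda receives the basecalled prefix of the read in progress and returns
    "reject" to eject it; any other return value lets the read continue.
    """
    ticks = 0
    while any(ch.reads for ch in channels):
        ticks += 1
        for ch in channels:
            if not ch.reads:
                continue
            if ch.idle > 0:  # channel is in the gap between two reads
                ch.idle -= 1
                continue
            read = ch.reads[0]
            ch.pos = min(ch.pos + bases_per_tick, len(read))
            if ch.pos == len(read) or ssda(read[:ch.pos]) == "reject":
                ch.finish()
    return ticks
```

With a rule that rejects every prefix starting with `C`, such reads are cut off after the first chunk while all other reads are sequenced to completion, which is the effect a depletion experiment aims for.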

AVAILABILITY AND IMPLEMENTATION

Code and fully worked examples are available on GitHub (https://github.com/ratschlab/sim_read_until).
