ReadBouncer: precise and scalable adaptive sampling for nanopore sequencing.

Affiliations

Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany.

Bioinformatics Unit (MF1), Robert Koch Institute, 13353 Berlin, Germany.

Publication information

Bioinformatics. 2022 Jun 24;38(Suppl 1):i153-i160. doi: 10.1093/bioinformatics/btac223.

Abstract

MOTIVATION

Nanopore sequencers allow targeted sequencing of nucleotide sequences of interest by rejecting other sequences from individual pores. This feature facilitates the enrichment of low-abundance sequences by depleting overrepresented ones in silico. Existing tools for adaptive sampling either apply signal alignment, which cannot handle human-sized reference sequences, or apply read mapping in sequence space, relying on fast graphics processing unit (GPU) base callers for real-time read rejection. Moreover, nanopore long-read mapping tools are not optimal for mapping the shorter reads that are usually analyzed in adaptive sampling applications.
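The rejection logic described above can be illustrated with a minimal sketch. It assumes a classifier has already reported whether a read prefix matches a reference; the names `Mode` and `decide` and the returned action strings are hypothetical and do not come from any sequencing API.

```python
from enum import Enum


class Mode(Enum):
    DEPLETE = "deplete"  # reject reads that match the reference
    ENRICH = "enrich"    # reject reads that do not match the reference


def decide(matches_reference: bool, mode: Mode) -> str:
    """Return 'unblock' to eject the read from its pore, else 'continue'.

    Hypothetical helper: the action names are placeholders, not API calls.
    """
    if mode is Mode.DEPLETE:
        return "unblock" if matches_reference else "continue"
    return "continue" if matches_reference else "unblock"
```

In depletion mode, a read that matches an overrepresented reference is ejected so the pore becomes free for another molecule; in enrichment mode, the decision is inverted.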

RESULTS

Here, we present a new approach for nanopore adaptive sampling that combines fast CPU and GPU base calling with read classification based on Interleaved Bloom Filters. ReadBouncer improves the potential enrichment of low-abundance sequences through its high read classification sensitivity and specificity, outperforming existing tools in the field. It robustly removes even reads belonging to large reference sequences while running on commodity hardware without GPUs, making adaptive sampling accessible to researchers working in the field. ReadBouncer also provides a user-friendly interface and installer files for end users without a bioinformatics background.
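To make the idea of read classification with an Interleaved Bloom Filter more concrete, the following toy sketch stores several Bloom filters (one per reference bin) interleaved in a single array, so that one lookup per hash function answers the membership question for all bins at once. It is an illustration under stated assumptions, not ReadBouncer's implementation: the class and function names, the BLAKE2b-based hashing, the k-mer size, and the `min_fraction` threshold are all chosen here for demonstration.

```python
import hashlib


class InterleavedBloomFilter:
    """Toy IBF: `bins` Bloom filters of `size` slots each, stored interleaved."""

    def __init__(self, bins, size, num_hashes=3):
        self.bins = bins
        self.size = size
        self.num_hashes = num_hashes
        # One integer bitmask per slot; bit b of a slot belongs to bin b.
        self.slots = [0] * size

    def _positions(self, kmer):
        # Derive num_hashes slot positions from salted BLAKE2b digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.size

    def insert(self, kmer, bin_index):
        for pos in self._positions(kmer):
            self.slots[pos] |= 1 << bin_index

    def query(self, kmer):
        """Return a bitmask of the bins that may contain `kmer`."""
        mask = (1 << self.bins) - 1
        for pos in self._positions(kmer):
            mask &= self.slots[pos]
        return mask


def kmers(seq, k):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_filter(references, k, size=1 << 16):
    """Insert every k-mer of reference b into bin b."""
    ibf = InterleavedBloomFilter(bins=len(references), size=size)
    for b, seq in enumerate(references):
        for km in kmers(seq, k):
            ibf.insert(km, b)
    return ibf


def classify(read, ibf, k, min_fraction=0.8):
    """Return the bins that share at least min_fraction of the read's k-mers."""
    counts = [0] * ibf.bins
    read_kmers = kmers(read, k)
    for km in read_kmers:
        hits = ibf.query(km)
        for b in range(ibf.bins):
            if (hits >> b) & 1:
                counts[b] += 1
    needed = min_fraction * len(read_kmers)
    return [b for b, c in enumerate(counts) if read_kmers and c >= needed]
```

With `ibf = build_filter([seq_a, seq_b], k=15)`, a call such as `classify(read, ibf, k=15)` returns the indices of the reference bins the read is assigned to. Because Bloom filters can yield false positives but no false negatives, an error-free read drawn from a reference is always reported for that bin; reads containing sequencing errors would need a lower `min_fraction`, which is a tuning choice left open in this sketch.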

AVAILABILITY AND IMPLEMENTATION

The C++ source code is available at https://gitlab.com/dacs-hpi/readbouncer.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
