Deepbinner: Demultiplexing barcoded Oxford Nanopore reads with deep convolutional neural networks.

Affiliation

Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria, Australia.

Publication information

PLoS Comput Biol. 2018 Nov 20;14(11):e1006583. doi: 10.1371/journal.pcbi.1006583. eCollection 2018 Nov.

DOI:10.1371/journal.pcbi.1006583

PMID:30458005

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6245502/

Abstract

Multiplexing, the simultaneous sequencing of multiple barcoded DNA samples on a single flow cell, has made Oxford Nanopore sequencing cost-effective for small genomes. However, it depends on the ability to sort the resulting sequencing reads by barcode, and current demultiplexing tools fail to classify many reads. Here we present Deepbinner, a tool for Oxford Nanopore demultiplexing that uses a deep neural network to classify reads based on the raw electrical read signal. This 'signal-space' approach allows for greater accuracy than existing 'base-space' tools (Albacore and Porechop) for which signals must first be converted to DNA base calls, itself a complex problem that can introduce noise into the barcode sequence. To assess Deepbinner and existing tools, we performed multiplex sequencing on 12 amplicons chosen for their distinguishability. This allowed us to establish a ground truth classification for each read based on internal sequence alone. Deepbinner had the lowest rate of unclassified reads (7.8%) and the highest demultiplexing precision (98.5% of classified reads were correctly assigned). It can be used alone (to maximise the number of classified reads) or in conjunction with other demultiplexers (to maximise precision and minimise false positive classifications). We also found cross-sample chimeric reads (0.3%) and evidence of barcode switching (0.3%) in our dataset, which likely arise during library preparation and may be detrimental for quantitative studies that use multiplexing. Deepbinner is open source (GPLv3) and available at https://github.com/rrwick/Deepbinner.
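
The two headline metrics in the abstract (the rate of unclassified reads, and precision among classified reads) and the idea of combining demultiplexers can be illustrated with a minimal Python sketch. This is not Deepbinner's code: the function names `demux_metrics` and `consensus_calls`, the barcode labels, and the toy data are all hypothetical, and the ground-truth labels are assumed to be already known (in the paper they were derived from each read's internal sequence).

```python
from typing import List, Optional, Sequence, Tuple


def demux_metrics(truth: Sequence[str],
                  calls: Sequence[Optional[str]]) -> Tuple[float, float]:
    """Return (unclassified_rate, precision) for one demultiplexer.

    truth: ground-truth barcode label per read (assumed known).
    calls: predicted barcode per read, or None if the read was left unclassified.
    Precision is measured only over reads that received a barcode call.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")
    total = len(calls)
    classified = [(t, c) for t, c in zip(truth, calls) if c is not None]
    correct = sum(1 for t, c in classified if t == c)
    unclassified_rate = (total - len(classified)) / total if total else 0.0
    precision = correct / len(classified) if classified else 0.0
    return unclassified_rate, precision


def consensus_calls(calls_a: Sequence[Optional[str]],
                    calls_b: Sequence[Optional[str]]) -> List[Optional[str]]:
    """Keep a barcode call only when both demultiplexers agree on it.

    Hypothetical combination rule: disagreement, or a missing call from
    either tool, leaves the read unclassified.
    """
    return [a if a is not None and a == b else None
            for a, b in zip(calls_a, calls_b)]


# Toy data (made up for illustration, not from the paper).
truth = ["bc01", "bc02", "bc03", "bc01"]
tool_a = ["bc01", "bc02", "bc01", None]
tool_b = ["bc01", None, "bc03", "bc01"]

print(demux_metrics(truth, tool_a))
print(demux_metrics(truth, consensus_calls(tool_a, tool_b)))
```

On this toy data, requiring agreement between the two tools removes the one incorrect call at the cost of leaving more reads unclassified, which mirrors the trade-off the abstract describes between using a demultiplexer alone and using it in conjunction with others.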
