PRONAME: a user-friendly pipeline to process long-read nanopore metabarcoding data by generating high-quality consensus sequences.

Author information

Dubois Benjamin, Delitte Mathieu, Lengrand Salomé, Bragard Claude, Legrève Anne, Debode Frédéric

Affiliations

Bioengineering Unit, Life Sciences Department, Walloon Agricultural Research Centre, Gembloux, Belgium.

Earth and Life Institute - Applied Microbiology, Plant Health, UCLouvain, Louvain-la-Neuve, Belgium.

Publication information

Front Bioinform. 2024 Dec 20;4:1483255. doi: 10.3389/fbinf.2024.1483255. eCollection 2024.

DOI:10.3389/fbinf.2024.1483255

PMID:39758955

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11695402/

Abstract

BACKGROUND

The study of sample taxonomic composition has evolved from direct observations and labor-intensive morphological studies to different DNA sequencing methodologies. Most of these studies leverage the metabarcoding approach, which involves the amplification of a small taxonomically-informative portion of the genome and its subsequent high-throughput sequencing. Recent advances in sequencing technology brought by Oxford Nanopore Technologies have revolutionized the field, enabling portability, affordable cost and long-read sequencing, therefore leading to a significant increase in taxonomic resolution. However, Nanopore sequencing data exhibit a particular profile, with a higher error rate compared with Illumina sequencing, and existing bioinformatics pipelines for the analysis of such data are scarce and often insufficient, requiring specialized tools to accurately process long-read sequences.

RESULTS

We present PRONAME (PROcessing NAnopore MEtabarcoding data), an open-source, user-friendly pipeline optimized for processing raw Nanopore sequencing data. PRONAME includes precompiled databases for complete 16S sequences (Silva138 and Greengenes2) and a newly developed and curated database dedicated to bacterial 16S-ITS-23S operon sequences. The user can also provide a custom database if desired, therefore enabling the analysis of metabarcoding data for any domain of life. The pipeline significantly improves sequence accuracy, implementing innovative error-correction strategies and taking advantage of the new sequencing chemistry to produce high-quality duplex reads. Evaluations using a mock community have shown that PRONAME delivers consensus sequences demonstrating at least 99.5% accuracy with standard settings (and up to 99.7%), making it a robust tool for genomic analysis of complex multi-species communities.

CONCLUSION

PRONAME meets the challenges of long-read Nanopore data processing, offering greater accuracy and versatility than existing pipelines. By integrating Nanopore-specific quality filtering, clustering and error correction, PRONAME produces high-precision consensus sequences. This brings the accuracy of Nanopore sequencing close to that of Illumina sequencing, while taking advantage of the benefits of long-read technologies.
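As a rough illustration of why consensus building raises accuracy, the sketch below takes a per-column majority vote over a handful of error-containing reads and compares the result with a reference. It is a toy example only: the reads are assumed to be already aligned and of equal length, the sequences are invented, and the function names `majority_consensus` and `percent_identity` are hypothetical. It does not reproduce PRONAME's actual clustering or error-correction steps.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-column majority base of equal-length, pre-aligned reads."""
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != "-":  # skip columns where a gap is the majority
            consensus.append(base)
    return "".join(consensus)


def percent_identity(seq, reference):
    """Percentage of matching positions between two equal-length sequences."""
    if len(seq) != len(reference):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq, reference))
    return 100.0 * matches / len(reference)


# Invented toy data: three of the five reads carry one substitution each.
reference = "ACGTACGTAC"
reads = [
    "ACGTACGTAC",
    "ACGAACGTAC",  # substitution at index 3
    "ACGTACCTAC",  # substitution at index 6
    "TCGTACGTAC",  # substitution at index 0
    "ACGTACGTAC",
]

consensus = majority_consensus(reads)
print(percent_identity(reads[1], reference))   # single noisy read
print(percent_identity(consensus, reference))  # consensus of all reads
```

Because the errors in the toy reads fall at different positions, each column still has a correct majority, so the consensus matches the reference even though individual reads do not.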

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/68f6/11695402/7fb6fcf3573e/fbinf-04-1483255-g001.jpg

Similar articles

Microbial Identification Using rRNA Operon Region: Database and Tool for Metataxonomics with Long-Read Sequence.

Microbiol Spectr. 2022 Apr 27;10(2):e0201721. doi: 10.1128/spectrum.02017-21. Epub 2022 Mar 30.

Primed and ready: nanopore metabarcoding can now recover highly accurate consensus barcodes that are generally indel-free.

BMC Genomics. 2024 Sep 9;25(1):842. doi: 10.1186/s12864-024-10767-4.

High accuracy meets high throughput for near full-length 16S ribosomal RNA amplicon sequencing on the Nanopore platform.

PNAS Nexus. 2024 Oct 9;3(10):pgae411. doi: 10.1093/pnasnexus/pgae411. eCollection 2024 Oct.

Polishing the Oxford Nanopore long-read assemblies of bacterial pathogens with Illumina short reads to improve genomic analyses.

Genomics. 2021 May;113(3):1366-1377. doi: 10.1016/j.ygeno.2021.03.018. Epub 2021 Mar 11.

MinION™ nanopore sequencing of environmental metagenomes: a synthetic approach.

Gigascience. 2017 Mar 1;6(3):1-10. doi: 10.1093/gigascience/gix007.

Complete pipeline for Oxford Nanopore Technology amplicon sequencing (ONT-AmpSeq): from pre-processing to creating an operational taxonomic unit table.

FEBS Open Bio. 2024 Nov;14(11):1779-1787. doi: 10.1002/2211-5463.13868. Epub 2024 Aug 7.

A novel barcoded nanopore sequencing workflow of high-quality, full-length bacterial 16S amplicons for taxonomic annotation of bacterial isolates and complex microbial communities.

mSystems. 2024 Oct 22;9(10):e0085924. doi: 10.1128/msystems.00859-24. Epub 2024 Sep 10.

RESCUE: a validated Nanopore pipeline to classify bacteria through long-read, 16S-ITS-23S rRNA sequencing.

Front Microbiol. 2023 Jul 20;14:1201064. doi: 10.3389/fmicb.2023.1201064. eCollection 2023.

NGSpeciesID: DNA barcode and amplicon consensus generation from long-read sequencing data.

Ecol Evol. 2021 Jan 11;11(3):1392-1398. doi: 10.1002/ece3.7146. eCollection 2021 Feb.

Cited by

Functional replacement of ancestral antibacterial secretion system in a bacterial plant pathogen.

Nat Ecol Evol. 2025 Jul 4. doi: 10.1038/s41559-025-02773-w.

References cited in this article

Humic substances increase tomato tolerance to osmotic stress while modulating vertically transmitted endophytic bacterial communities.

Front Plant Sci. 2024 Nov 19;15:1488671. doi: 10.3389/fpls.2024.1488671. eCollection 2024.

GROND: a quality-checked and publicly available database of full-length 16S-ITS-23S rRNA operon sequences.

Microb Genom. 2024 Jun;10(6). doi: 10.1099/mgen.0.001255.

Time- and memory-efficient genome assembly with Raven.

Nat Comput Sci. 2021 May;1(5):332-336. doi: 10.1038/s43588-021-00073-4. Epub 2021 May 20.

RESCUE: a validated Nanopore pipeline to classify bacteria through long-read, 16S-ITS-23S rRNA sequencing.

Front Microbiol. 2023 Jul 20;14:1201064. doi: 10.3389/fmicb.2023.1201064. eCollection 2023.

Greengenes2 unifies microbial data in a single reference tree.

Nat Biotechnol. 2024 May;42(5):715-718. doi: 10.1038/s41587-023-01845-1. Epub 2023 Jul 27.

A ribosomal operon database and MegaBLAST settings for strain-level resolution of microbiomes.

FEMS Microbes. 2022 Jan 27;3:xtac002. doi: 10.1093/femsmc/xtac002. eCollection 2022.

Using nanopore sequencing to identify fungi from clinical samples with high phylogenetic resolution.

Sci Rep. 2023 Jun 16;13(1):9785. doi: 10.1038/s41598-023-37016-0.

Oxford nanopore long-read sequencing enables the generation of complete bacterial and plasmid genomes without short-read sequencing.

Front Microbiol. 2023 May 15;14:1179966. doi: 10.3389/fmicb.2023.1179966. eCollection 2023.

Nanopore Is Preferable over Illumina for 16S Amplicon Sequencing of the Gut Microbiota When Species-Level Taxonomic Classification, Accurate Estimation of Richness, or Focus on Rare Taxa Is Required.

Microorganisms. 2023 Mar 21;11(3):804. doi: 10.3390/microorganisms11030804.

Complete sequence verification of plasmid DNA using the Oxford Nanopore Technologies' MinION device.

BMC Bioinformatics. 2023 Mar 24;24(1):116. doi: 10.1186/s12859-023-05226-y.
