Dadasnake, a Snakemake implementation of DADA2 to process amplicon sequencing data for microbial ecology.

Affiliations

Helmholtz Centre for Environmental Research GmbH - UFZ, Department of Soil Ecology; Theodor-Lieser-Str. 4, 06120 Halle, Germany.

German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Metagenomics Support Unit; Puschstr. 4, 04103 Leipzig, Germany.

Publication information

Gigascience. 2020 Nov 30;9(12). doi: 10.1093/gigascience/giaa135.

DOI:10.1093/gigascience/giaa135

PMID:33252655

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7702218/

Abstract

BACKGROUND

Amplicon sequencing of phylogenetic marker genes, e.g., 16S, 18S, or ITS ribosomal RNA sequences, is still the most commonly used method to determine the composition of microbial communities. Microbial ecologists often have expert knowledge on their biological question and data analysis in general, and most research institutes have computational infrastructures to use the bioinformatics command line tools and workflows for amplicon sequencing analysis, but requirements of bioinformatics skills often limit the efficient and up-to-date use of computational resources.

RESULTS

We present dadasnake, a user-friendly, 1-command Snakemake pipeline that wraps the preprocessing of sequencing reads and the delineation of exact sequence variants by using the favorably benchmarked and widely used DADA2 algorithm with a taxonomic classification and the post-processing of the resultant tables, including hand-off in standard formats. The suitability of the provided default configurations is demonstrated using mock community data from bacteria and archaea, as well as fungi.

CONCLUSIONS

By use of Snakemake, dadasnake makes efficient use of high-performance computing infrastructures. Easy user configuration guarantees flexibility of all steps, including the processing of data from multiple sequencing platforms. It is easy to install dadasnake via conda environments. dadasnake is available at https://github.com/a-h-b/dadasnake.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dcd/7702218/9b73c78144c5/giaa135fig1.jpg
