APSCALE: advanced pipeline for simple yet comprehensive analyses of DNA metabarcoding data.

Affiliations

University of Duisburg-Essen, Faculty of Biology, Aquatic Ecosystem Research, Essen 45141, Germany.

University of Duisburg-Essen, Centre for Water and Environmental Research (ZWU), Essen 45141, Germany.

Publication information

Bioinformatics. 2022 Oct 14;38(20):4817-4819. doi: 10.1093/bioinformatics/btac588.

DOI:10.1093/bioinformatics/btac588

PMID:36029248

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9563694/

Abstract

SUMMARY

DNA metabarcoding is an emerging approach to assess and monitor biodiversity worldwide and consequently the number and size of data sets increases exponentially. To date, no published DNA metabarcoding data processing pipeline exists that is (i) platform independent, (ii) easy to use [incl. graphical user interface (GUI)], (iii) fast (does scale well with dataset size) and (iv) complies with data protection regulations of e.g. environmental agencies. The presented pipeline APSCALE meets these requirements and handles the most common tasks of sequence data processing, such as paired-end merging, primer trimming, quality filtering, clustering and denoising of any popular metabarcoding marker, such as internal transcribed spacer, 16S or cytochrome c oxidase subunit I. APSCALE comes in a command line and a GUI version. The latter provides the user with additional summary statistics options and links to GUI-based downstream applications.

AVAILABILITY AND IMPLEMENTATION

APSCALE is written in Python, a platform-independent language, and integrates functions of the open-source tools, VSEARCH (Rognes et al., 2016), cutadapt (Martin, 2011) and LULU (Frøslev et al., 2017). All modules support multithreading to allow fast processing of larger DNA metabarcoding datasets. Further information and troubleshooting are provided on the respective GitHub pages for the command-line version (https://github.com/DominikBuchner/apscale) and the GUI-based version (https://github.com/TillMacher/apscale_gui), including a detailed tutorial.
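The processing steps named in the summary can be sketched as a sequence of VSEARCH and cutadapt calls. The following is a minimal illustration only and is not APSCALE's actual code or API: the function names `build_commands` and `run_all`, the file naming scheme, the 97% identity threshold and the maximum expected error of 1.0 are all assumptions chosen for the example. It is simplified to a single sample, omits the LULU step, and passes thread options only to the commands shown.

```python
import subprocess
from typing import List


def build_commands(sample: str, fwd_primer: str, rev_primer_rc: str,
                   threads: int = 4) -> List[List[str]]:
    """Build argv lists for one sample (hypothetical helper; nothing is run here).

    rev_primer_rc is the reverse complement of the reverse primer, as it
    appears at the 3' end of a merged read.
    """
    t = str(threads)
    return [
        # 1. Paired-end merging
        ["vsearch", "--fastq_mergepairs", f"{sample}_R1.fastq",
         "--reverse", f"{sample}_R2.fastq",
         "--fastqout", f"{sample}_merged.fastq", "--threads", t],
        # 2. Primer trimming with a linked adapter (forward...reverse)
        ["cutadapt", "-g", f"{fwd_primer}...{rev_primer_rc}",
         "--discard-untrimmed", "-j", t,
         "-o", f"{sample}_trimmed.fastq", f"{sample}_merged.fastq"],
        # 3. Quality filtering by maximum expected errors (assumed threshold)
        ["vsearch", "--fastq_filter", f"{sample}_trimmed.fastq",
         "--fastq_maxee", "1.0",
         "--fastaout", f"{sample}_filtered.fasta"],
        # 4. Dereplication, keeping abundance annotations
        ["vsearch", "--derep_fulllength", f"{sample}_filtered.fasta",
         "--sizeout", "--output", f"{sample}_derep.fasta"],
        # 5a. Clustering into OTUs (assumed 97% identity)
        ["vsearch", "--cluster_size", f"{sample}_derep.fasta",
         "--sizein", "--id", "0.97",
         "--centroids", f"{sample}_otus.fasta", "--threads", t],
        # 5b. Denoising with the UNOISE algorithm as implemented in VSEARCH
        ["vsearch", "--cluster_unoise", f"{sample}_derep.fasta",
         "--centroids", f"{sample}_esvs.fasta"],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_all(build_commands("sample1", "FORWARDPRIMER", "REVERSEPRIMERRC"))` would require VSEARCH and cutadapt on the `PATH` and the two input FASTQ files in the working directory. Building the argument lists separately from running them keeps the sketch inspectable without the tools installed.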

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

APSCALE: advanced pipeline for simple yet comprehensive analyses of DNA metabarcoding data.

Bioinformatics. 2022 Oct 14;38(20):4817-4819. doi: 10.1093/bioinformatics/btac588.

TaxonTableTools: A comprehensive, platform-independent graphical user interface software to explore and visualise DNA metabarcoding data.

Mol Ecol Resour. 2021 Jul;21(5):1705-1714. doi: 10.1111/1755-0998.13358. Epub 2021 Mar 11.

SLIM: a flexible web application for the reproducible processing of environmental DNA metabarcoding data.

BMC Bioinformatics. 2019 Feb 19;20(1):88. doi: 10.1186/s12859-019-2663-2.

AXIOME3: Automation, eXtension, and Integration Of Microbial Ecology.

Gigascience. 2021 Feb 3;10(2). doi: 10.1093/gigascience/giab006.

User-friendly bioinformatics pipeline gDAT (graphical downstream analysis tool) for analysing rDNA sequences.

Mol Ecol Resour. 2021 May;21(4):1380-1392. doi: 10.1111/1755-0998.13340. Epub 2021 Feb 12.

MLDSP-GUI: an alignment-free standalone tool with an interactive graphical user interface for DNA sequence comparison and analysis.

Bioinformatics. 2020 Apr 1;36(7):2258-2259. doi: 10.1093/bioinformatics/btz918.

DNA metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect match.

Biol Lett. 2014 Sep;10(9). doi: 10.1098/rsbl.2014.0562.

NeuroPycon: An open-source python toolbox for fast multi-modal and reproducible brain connectivity pipelines.

Neuroimage. 2020 Oct 1;219:117020. doi: 10.1016/j.neuroimage.2020.117020. Epub 2020 Jun 6.

Bioinformatic pipelines combining denoising and clustering tools allow for more comprehensive prokaryotic and eukaryotic metabarcoding.

Mol Ecol Resour. 2021 Aug;21(6):1904-1921. doi: 10.1111/1755-0998.13398. Epub 2021 Apr 27.

Just keep it simple? Benchmarking the accuracy of taxonomy assignment software in metabarcoding studies.

Mol Ecol Resour. 2021 Oct;21(7):2187-2189. doi: 10.1111/1755-0998.13473. Epub 2021 Aug 3.

Cited by

Authenticity in bakery products: Detection of pistachio fraud using NGS metabarcoding.

Food Chem (Oxf). 2025 May 5;10:100260. doi: 10.1016/j.fochms.2025.100260. eCollection 2025 Jun.

Metacommunity Theory and Metabarcoding Reveal the Environmental, Spatial and Biotic Drivers of Meiofaunal Communities in Sandy Beaches.

Mol Ecol. 2025 Apr;34(8):e17733. doi: 10.1111/mec.17733. Epub 2025 Mar 20.

Benthic Feeding and Diet Partitioning in Red Sea Mesopelagic Fish Resolved Through DNA Metabarcoding and ROV Footage.

Ecol Evol. 2025 Mar 6;15(3):e71091. doi: 10.1002/ece3.71091. eCollection 2025 Mar.

Is it worth the extra mile? Comparing environmental DNA and RNA metabarcoding for vertebrate and invertebrate biodiversity surveys in a lowland stream.

PeerJ. 2024 Oct 24;12:e18016. doi: 10.7717/peerj.18016. eCollection 2024.

Upscaling biodiversity monitoring: Metabarcoding estimates 31,846 insect species from Malaise traps across Germany.

Mol Ecol Resour. 2025 Jan;25(1):e14023. doi: 10.1111/1755-0998.14023. Epub 2024 Oct 4.

Primed and ready: nanopore metabarcoding can now recover highly accurate consensus barcodes that are generally indel-free.

BMC Genomics. 2024 Sep 9;25(1):842. doi: 10.1186/s12864-024-10767-4.

Synchronised monitoring of plant and insect diversity: a case study using automated Malaise traps and DNA-based methods.

Biodivers Data J. 2024 Jul 30;12:e127669. doi: 10.3897/BDJ.12.e127669. eCollection 2024.

Unlocking rivers' hidden diversity and ecological status using DNA metabarcoding in Northwest Spain.

Ecol Evol. 2024 Aug 1;14(8):e70110. doi: 10.1002/ece3.70110. eCollection 2024 Aug.

Establishing Silphids in the invertebrate DNA toolbox: a proof of concept.

PeerJ. 2024 Jul 8;12:e17636. doi: 10.7717/peerj.17636. eCollection 2024.

Predicting environmental stressor levels with machine learning: a comparison between amplicon sequencing, metagenomics, and total RNA sequencing based on taxonomically assigned data.

Front Microbiol. 2023 Nov 24;14:1217750. doi: 10.3389/fmicb.2023.1217750. eCollection 2023.

References

Standardized high-throughput biomonitoring using DNA metabarcoding: Strategies for the adoption of automated liquid handlers.

Environ Sci Ecotechnol. 2021 Aug 30;8:100122. doi: 10.1016/j.ese.2021.100122. eCollection 2021 Oct.

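TaxonTableTools: A comprehensive, platform-independent graphical user interface software to explore and visualise DNA metabarcoding data.

Mol Ecol Resour. 2021 Jul;21(5):1705-1714. doi: 10.1111/1755-0998.13358. Epub 2021 Mar 11.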

Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2.

Nat Biotechnol. 2019 Aug;37(8):852-857. doi: 10.1038/s41587-019-0209-9.

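SLIM: a flexible web application for the reproducible processing of environmental DNA metabarcoding data.

BMC Bioinformatics. 2019 Feb 19;20(1):88. doi: 10.1186/s12859-019-2663-2.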

Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates.

Nat Commun. 2017 Oct 30;8(1):1188. doi: 10.1038/s41467-017-01312-x.

VSEARCH: a versatile open source tool for metagenomics.

PeerJ. 2016 Oct 18;4:e2584. doi: 10.7717/peerj.2584. eCollection 2016.

DADA2: High-resolution sample inference from Illumina amplicon data.

Nat Methods. 2016 Jul;13(7):581-3. doi: 10.1038/nmeth.3869. Epub 2016 May 23.

Search and clustering orders of magnitude faster than BLAST.

Bioinformatics. 2010 Oct 1;26(19):2460-1. doi: 10.1093/bioinformatics/btq461. Epub 2010 Aug 12.

BLAST+: architecture and applications.

BMC Bioinformatics. 2009 Dec 15;10:421. doi: 10.1186/1471-2105-10-421.
