Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

Affiliations

The Pathogen and Microbiome Institute, Northern Arizona University, PO Box 4073, Flagstaff, AZ, 86011-4073, USA.

Research School of Biology, Australian National University, 46 Sullivans Creek Road, Acton ACT, 2601, Australia.

Publication information

Microbiome. 2018 May 17;6(1):90. doi: 10.1186/s40168-018-0470-z.

DOI:10.1186/s40168-018-0470-z

PMID:29773078

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5956843/

Abstract

BACKGROUND

Taxonomic classification of marker-gene sequences is an important step in microbiome analysis.

RESULTS

We present q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker-gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit (https://github.com/caporaso-lab/tax-credit-data).

CONCLUSIONS

Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
