Comprehensive benchmarking and ensemble approaches for metagenomic classifiers.

Affiliations

Tri-Institutional Program in Computational Biology and Medicine, New York, NY, USA.

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA.

Publication information

Genome Biol. 2017 Sep 21;18(1):182. doi: 10.1186/s13059-017-1299-7.


DOI:10.1186/s13059-017-1299-7
PMID:28934964
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5609029/
Abstract

BACKGROUND: One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited.

RESULTS: In this study, we use the largest-to-date set of laboratory-generated and simulated controls across 846 species to evaluate the performance of 11 metagenomic classifiers. Tools were characterized on the basis of their ability to identify taxa at the genus, species, and strain levels, quantify relative abundances of taxa, and classify individual reads to the species level. Strikingly, the number of species identified by the 11 tools can differ by over three orders of magnitude on the same datasets. Various strategies can ameliorate taxonomic misclassification, including abundance filtering, ensemble approaches, and tool intersection. Nevertheless, these strategies were often insufficient to completely eliminate false positives from environmental samples, which are especially important where they concern medically relevant species. Overall, pairing tools with different classification strategies (k-mer, alignment, marker) can combine their respective advantages.

CONCLUSIONS: This study provides positive and negative controls, titrated standards, and a guide for selecting tools for metagenomic analyses by comparing ranges of precision, accuracy, and recall. We show that proper experimental design and analysis parameters can reduce false positives, provide greater resolution of species in complex metagenomic samples, and improve the interpretation of results.
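
The abstract names abundance filtering, ensemble approaches, and tool intersection as ways to reduce false positives. The sketch below illustrates those ideas on toy data only; it is not the authors' pipeline. The tool names (`kmer_tool`, `alignment_tool`, `marker_tool`), the species profiles, and the 3% abundance threshold are all hypothetical values chosen for illustration.

```python
from typing import Dict, Set, Tuple

Profile = Dict[str, float]  # species name -> relative abundance in [0, 1]


def abundance_filter(profile: Profile, min_abundance: float) -> Set[str]:
    """Keep species whose relative abundance is at least min_abundance."""
    return {sp for sp, ab in profile.items() if ab >= min_abundance}


def vote(calls: Dict[str, Set[str]], min_tools: int) -> Set[str]:
    """Keep species reported by at least min_tools tools.

    min_tools == len(calls) gives a strict intersection of all tools.
    """
    counts: Dict[str, int] = {}
    for species in calls.values():
        for sp in species:
            counts[sp] = counts.get(sp, 0) + 1
    return {sp for sp, n in counts.items() if n >= min_tools}


def precision_recall(predicted: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Species-level precision and recall against a known composition."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical mock community and per-tool profiles (not data from the paper).
truth = {"Escherichia coli", "Staphylococcus aureus", "Bacillus subtilis"}
profiles: Dict[str, Profile] = {
    "kmer_tool": {
        "Escherichia coli": 0.50, "Staphylococcus aureus": 0.30,
        "Bacillus subtilis": 0.15, "Shigella flexneri": 0.04,
        "Bacillus anthracis": 0.01,
    },
    "alignment_tool": {
        "Escherichia coli": 0.55, "Staphylococcus aureus": 0.35,
        "Bacillus subtilis": 0.08, "Shigella flexneri": 0.02,
    },
    "marker_tool": {"Escherichia coli": 0.60, "Staphylococcus aureus": 0.40},
}

# Step 1: drop low-abundance calls per tool (threshold is an assumption).
filtered = {tool: abundance_filter(p, 0.03) for tool, p in profiles.items()}

# Step 2: combine tools, either by majority vote or by strict intersection.
consensus = vote(filtered, min_tools=2)
strict = vote(filtered, min_tools=len(filtered))

raw_pr = precision_recall(set(profiles["kmer_tool"]), truth)
consensus_pr = precision_recall(consensus, truth)
strict_pr = precision_recall(strict, truth)
```

In this toy case the unfiltered k-mer profile includes two false positives, the two-of-three vote after filtering recovers exactly the mock community, and the strict intersection trades recall for precision because the marker-based profile misses one species. Real samples will not separate this cleanly, which is consistent with the abstract's caution that these strategies often fail to remove every false positive.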


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/8b07d9d1c32e/13059_2017_1299_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/fccafbdf0e60/13059_2017_1299_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/7ee4075b8c10/13059_2017_1299_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/382645103edf/13059_2017_1299_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/f96eafe88fbf/13059_2017_1299_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/92a2ccabaa0e/13059_2017_1299_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63f3/5609029/25fa020cb101/13059_2017_1299_Fig7_HTML.jpg

Similar articles

[1]
Comprehensive benchmarking and ensemble approaches for metagenomic classifiers.

Genome Biol. 2017-9-21

[2]
Evaluation of taxonomic classification and profiling methods for long-read shotgun metagenomic sequencing datasets.

BMC Bioinformatics. 2022-12-13

[3]
Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools.

BMC Bioinformatics. 2018-8-30

[4]
Benchmarking Metagenomics Tools for Taxonomic Classification.

Cell. 2019-8-8

[5]
Gauge your phage: benchmarking of bacteriophage identification tools in metagenomic sequencing data.

Microbiome. 2023-4-21

[6]
Unbiased Taxonomic Annotation of Metagenomic Samples.

J Comput Biol. 2018-3

[7]
CAIM: coverage-based analysis for identification of microbiome.

Brief Bioinform. 2024-7-25

[8]
k-SLAM: accurate and ultra-fast taxonomic classification and gene identification for large metagenomic data sets.

Nucleic Acids Res. 2017-2-28

[9]
Crowdsourced benchmarking of taxonomic metagenome profilers: lessons learned from the sbv IMPROVER Microbiomics challenge.

BMC Genomics. 2022-8-30

[10]
Qmatey: an automated pipeline for fast exact matching-based alignment and strain-level taxonomic binning and profiling of metagenomes.

Brief Bioinform. 2023-9-22

Cited by

[1]
The Role of Gut Microbiota in Gastrointestinal Immune Homeostasis and Inflammation: Implications for Inflammatory Bowel Disease.

Biomedicines. 2025-7-24

[2]
Investigating fungal diversity through metabarcoding for environmental samples: assessment of ITS1 and ITS2 Illumina sequencing using multiple defined mock communities with different classification methods and reference databases.

BMC Genomics. 2025-8-6

[3]
Advancing metagenomic classification with NABAS+: a novel alignment-based approach.

NAR Genom Bioinform. 2025-7-4

[4]
Precise and scalable metagenomic profiling with sample-tailored minimizer libraries.

NAR Genom Bioinform. 2025-6-9

[5]
Bioinformatic approaches to blood and tissue microbiome analyses: challenges and perspectives.

Brief Bioinform. 2025-3-4

[6]
Enhancing nucleotide sequence representations in genomic analysis with contrastive optimization.

Commun Biol. 2025-3-29

[7]
Addressing the dynamic nature of reference data: a new nucleotide database for robust metagenomic classification.

mSystems. 2025-4-22

[8]
Revisiting the cancer microbiome using PRISM.

bioRxiv. 2025-1-24

[9]
Impact of simulation and reference catalogues on the evaluation of taxonomic profiling pipelines.

Microb Genom. 2025-1

[10]
The Naïve Bayes classifier++ for metagenomic taxonomic classification-query evaluation.

Bioinformatics. 2024-12-26
