Performance of Microbiome Sequence Inference Methods in Environments with Varying Biomass

Authors

Caruso Vincent, Song Xubo, Asquith Mark, Karstens Lisa

Affiliations

Division of Bioinformatics and Computational Biology, Oregon Health and Science University, Portland, Oregon, USA.

Center for Spoken Language Understanding, Oregon Health and Science University, Portland, Oregon, USA.

Publication information

mSystems. 2019 Feb 19;4(1). doi: 10.1128/mSystems.00163-18. eCollection 2019 Jan-Feb.

DOI:10.1128/mSystems.00163-18

PMID:30801029

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6381225/

Abstract

Microbiome community composition plays an important role in human health, and while most research to date has focused on high-microbial-biomass communities, low-biomass communities are also important. However, contamination and technical noise make determining the true community signal difficult when biomass levels are low, and the influence of varying biomass on sequence processing methods has received little attention. Here, we benchmarked six methods that infer community composition from 16S rRNA sequence reads, using samples of varying biomass. We included two operational taxonomic unit (OTU) clustering algorithms, one entropy-based method, and three more-recent amplicon sequence variant (ASV) methods. We first compared inference results from high-biomass mock communities to assess baseline performance. We then benchmarked the methods on a dilution series made from a single mock community, that is, samples that varied only in biomass. ASVs/OTUs inferred by each method were classified as representing expected community, technical noise, or contamination. With the high-biomass data, we found that the ASV methods had good sensitivity and precision, whereas the other methods suffered in one area or in both. Inferred contamination was present only in small proportions. With the dilution series, contamination represented an increasing proportion of the data from the inferred communities, regardless of the inference method used. However, correlation between inferred contaminants and sample biomass was strongest for the ASV methods and weakest for the OTU methods. Thus, no inference method on its own can distinguish true community sequences from contaminant sequences, but ASV methods provide the most accurate characterization of community and contaminants.

Importance: Microbial communities have important ramifications for human health, but determining their impact requires accurate characterization. Current technology makes microbiome sequence data more accessible than ever. However, popular software methods for analyzing these data are based on algorithms developed alongside older sequencing technology and smaller data sets and thus may not be adequate for modern, high-throughput data sets. Additionally, samples from environments where microbes are scarce present additional challenges to community characterization relative to high-biomass environments, an issue that is often ignored. We found that a new class of microbiome sequence processing tools, called amplicon sequence variant (ASV) methods, outperformed conventional methods. In samples representing low-biomass communities, where sample contamination becomes a significant confounding factor, the improved accuracy of ASV methods may allow more-robust computational identification of contaminants.
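The abstract describes labeling each inferred ASV/OTU as expected community, technical noise, or contamination, and then reporting sensitivity and precision. The Python sketch below illustrates that kind of bookkeeping under stated assumptions; it is not the authors' procedure. The mismatch threshold `max_noise_dist`, the function names, and the toy sequences are all hypothetical, and anything that is neither an exact nor a near match to a reference is simply lumped in as a contaminant, whereas a real analysis would normally confirm contaminants against blank controls or a database.

```python
from typing import Dict, Set


def mismatches(a: str, b: str) -> int:
    """Count differing positions; any length difference is added as extra mismatches."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def classify(seq: str, expected: Set[str], max_noise_dist: int = 2) -> str:
    """Label one inferred sequence.

    max_noise_dist is a hypothetical threshold, not a value from the paper.
    """
    if seq in expected:
        return "expected"
    if min(mismatches(seq, ref) for ref in expected) <= max_noise_dist:
        return "noise"
    # Simplifying assumption: everything else is treated as contamination.
    return "contaminant"


def summarize(counts: Dict[str, int], expected: Set[str]) -> Dict[str, float]:
    """Compute sequence-level sensitivity and precision, plus the read fraction labeled contaminant."""
    labels = {seq: classify(seq, expected) for seq in counts}
    found = [seq for seq, label in labels.items() if label == "expected"]
    contaminant_reads = sum(n for seq, n in counts.items() if labels[seq] == "contaminant")
    return {
        # Fraction of expected sequences that the method recovered exactly.
        "sensitivity": len(found) / len(expected),
        # Fraction of inferred sequences that are exact expected sequences.
        "precision": len(found) / len(counts),
        # Fraction of all reads assigned to sequences labeled contaminant.
        "contaminant_fraction": contaminant_reads / sum(counts.values()),
    }


if __name__ == "__main__":
    # Toy data: three expected sequences and four inferred sequences with read counts.
    expected_seqs = {"ACGTACGT", "TTGGCCAA", "GATTACAG"}
    inferred_counts = {"ACGTACGT": 500, "TTGGCCAA": 300, "ACGTACGA": 10, "CCCCCCCC": 40}
    print(summarize(inferred_counts, expected_seqs))
```

Running `summarize` once per sample in a dilution series would give a contaminant fraction per dilution, which could then be compared with sample biomass. The sketch assumes non-empty inputs and does no alignment, so it is only suitable for illustration.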

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1c3e/6381225/babaa22acecd/mSystems.00163-18-f0001.jpg

Similar articles

Controlling for Contaminants in Low-Biomass 16S rRNA Gene Sequencing Experiments.

mSystems. 2019 Jun 4;4(4):e00290-19. doi: 10.1128/mSystems.00290-19.

Ecological Observations Based on Functional Gene Sequencing Are Sensitive to the Amplicon Processing Method.

mSphere. 2022 Aug 31;7(4):e0032422. doi: 10.1128/msphere.00324-22. Epub 2022 Aug 8.

PanFP: pangenome-based functional profiles for microbial communities.

BMC Res Notes. 2015 Sep 26;8:479. doi: 10.1186/s13104-015-1462-8.

Daring to be differential: metabarcoding analysis of soil and plant-related microbial communities using amplicon sequence variants and operational taxonomical units.

BMC Genomics. 2020 Oct 22;21(1):733. doi: 10.1186/s12864-020-07126-4.

LotuS2: an ultrafast and highly accurate tool for amplicon sequencing analysis.

Microbiome. 2022 Oct 19;10(1):176. doi: 10.1186/s40168-022-01365-1.

From reads to operational taxonomic units: an ensemble processing pipeline for MiSeq amplicon sequencing data.

Gigascience. 2017 Feb 1;6(2):1-10. doi: 10.1093/gigascience/giw017.

Handling of spurious sequences affects the outcome of high-throughput 16S rRNA gene amplicon profiling.

ISME Commun. 2021 Jun 29;1(1):31. doi: 10.1038/s43705-021-00033-z.

KatharoSeq Enables High-Throughput Microbiome Analysis from Low-Biomass Samples.

mSystems. 2018 Mar 13;3(3). doi: 10.1128/mSystems.00218-17. eCollection 2018 May-Jun.

OptiClust, an Improved Method for Assigning Amplicon-Based Sequence Data to Operational Taxonomic Units.

mSphere. 2017 Mar 8;2(2). doi: 10.1128/mSphereDirect.00073-17. eCollection 2017 Mar-Apr.

Cited by

Investigating fungal diversity through metabarcoding for environmental samples: assessment of ITS1 and ITS2 Illumina sequencing using multiple defined mock communities with different classification methods and reference databases.

BMC Genomics. 2025 Aug 6;26(1):729. doi: 10.1186/s12864-025-11917-y.

The unresolved struggle of 16S rRNA amplicon sequencing: a benchmarking analysis of clustering and denoising methods.

Environ Microbiome. 2025 May 13;20(1):51. doi: 10.1186/s40793-025-00705-6.

Plasmid Backbone Impacts Conjugation Rate, Transconjugant Fitness, and Community Assembly of Genetically Bioaugmented Soil Microbes for PAH Bioremediation.

ACS Environ Au. 2025 Jan 22;5(2):241-252. doi: 10.1021/acsenvironau.4c00123. eCollection 2025 Mar 19.

Anthropogenic reverberations on the gut microbiome of dwarf chameleons ().

PeerJ. 2025 Feb 28;13:e18811. doi: 10.7717/peerj.18811. eCollection 2025.

Comparing subsampling strategies for metagenomic analysis in microbial studies using amplicon sequence variants versus operational taxonomic units.

PLoS One. 2024 Dec 30;19(12):e0315720. doi: 10.1371/journal.pone.0315720. eCollection 2024.

Dietary Energy Sources Affect Cecal and Fecal Microbiota of Healthy Horses.

Animals (Basel). 2024 Dec 3;14(23):3494. doi: 10.3390/ani14233494.

ASV vs OTUs clustering: Effects on alpha, beta, and gamma diversities in microbiome metabarcoding studies.

PLoS One. 2024 Oct 3;19(10):e0309065. doi: 10.1371/journal.pone.0309065. eCollection 2024.

The salivary microbiome as a diagnostic biomarker of periodontitis: a 16S multi-batch study before and after the removal of batch effects.

Front Cell Infect Microbiol. 2024 Jul 12;14:1405699. doi: 10.3389/fcimb.2024.1405699. eCollection 2024.

Existence of rare actinobacterial forms in the Indian sector of Southern Ocean: 16 S rRNA based metabarcoding study.

Braz J Microbiol. 2024 Sep;55(3):2363-2370. doi: 10.1007/s42770-024-01424-9. Epub 2024 Jul 11.

Synthesis of current pediatric urinary microbiome research.

Front Pediatr. 2024 Jun 18;12:1396408. doi: 10.3389/fped.2024.1396408. eCollection 2024.

References

Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data.

Microbiome. 2018 Dec 17;6(1):226. doi: 10.1186/s40168-018-0605-2.

Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches.

PeerJ. 2018 Aug 8;6:e5364. doi: 10.7717/peerj.5364. eCollection 2018.

KatharoSeq Enables High-Throughput Microbiome Analysis from Low-Biomass Samples.

mSystems. 2018 Mar 13;3(3). doi: 10.1128/mSystems.00218-17. eCollection 2018 May-Jun.

Exact sequence variants should replace operational taxonomic units in marker-gene data analysis.

ISME J. 2017 Dec;11(12):2639-2643. doi: 10.1038/ismej.2017.119. Epub 2017 Jul 21.

Deblur Rapidly Resolves Single-Nucleotide Community Sequence Patterns.

mSystems. 2017 Mar 7;2(2). doi: 10.1128/mSystems.00191-16. eCollection 2017 Mar-Apr.

Improved Bacterial 16S rRNA Gene (V4 and V4-5) and Fungal Internal Transcribed Spacer Marker Gene Primers for Microbial Community Surveys.

mSystems. 2015 Dec 22;1(1). doi: 10.1128/mSystems.00009-15. eCollection 2016 Jan-Feb.

Open-Source Sequence Clustering Methods Improve the State Of the Art.

mSystems. 2016 Feb 9;1(1). doi: 10.1128/mSystems.00003-15. eCollection 2016 Jan-Feb.

VSEARCH: a versatile open source tool for metagenomics.

PeerJ. 2016 Oct 18;4:e2584. doi: 10.7717/peerj.2584. eCollection 2016.

Does the Urinary Microbiome Play a Role in Urgency Urinary Incontinence and Its Severity?

Front Cell Infect Microbiol. 2016 Jul 27;6:78. doi: 10.3389/fcimb.2016.00078. eCollection 2016.

Inherent bacterial DNA contamination of extraction and sequencing reagents may affect interpretation of microbiota in low bacterial biomass samples.

Gut Pathog. 2016 May 26;8:24. doi: 10.1186/s13099-016-0103-7. eCollection 2016.
