ASV vs OTUs clustering: Effects on alpha, beta, and gamma diversities in microbiome metabarcoding studies.

Affiliation

Department of Agronomy, Animals, Food, Natural Resources and Environment DAFNAE, University of Padova, Padua, Italy.

Publication information

PLoS One. 2024 Oct 3;19(10):e0309065. doi: 10.1371/journal.pone.0309065. eCollection 2024.

Abstract

In microbial community sequencing of bacterial ribosomal 16S rDNA or fungal ITS, the targeted genes are the basis for taxonomic assignment. For decades, the traditional bioinformatic procedure has relied on a clustering protocol in which sequences are pooled into groups sharing a given percent identity, typically 97%, to yield Operational Taxonomic Units (OTUs). Progress in data processing methods has, however, made it possible to minimize technical sequencing errors, which were the main reason for choosing OTUs, and to analyze instead the exact Amplicon Sequence Variants (ASVs), a choice that leaves reads far less agglomerated. We tested the two procedures on the same 16S metabarcoding dataset of bacterial amplicons, comprising samples from 17 adjacent habitats along a 700-meter transect within a single coastal area. The transect follows a gradient of ecological conditions from cropland through meadows, forest, and all successional transitions up to the seashore. This design made it possible to survey a high-biodiversity area, to measure its alpha, beta, and gamma diversity, and to verify how the bioinformatic choice affects, on identical data, the values of ten different ecological indexes and other parameters. Two levels of progressive OTU clustering (99% and 97%) were compared with the ASV data. The results showed that OTU clustering led proportionally to a marked underestimation of the ecological indicator values for species diversity, and to distorted behaviour of the dominance and evenness indexes relative to direct use of the ASV data. Multivariate ordination analyses were also sensitive to the choice, in terms of tree topology and coherence. Overall, the data support the view that reference-based OTU clustering carries several misleading biases, including the risk of missing novel taxa that are not yet referenced in databases. Its alternative, de novo clustering, has drawbacks of its own: heavier computational demand and problems with the comparability of results, especially for environmental studies that contain several as yet uncharacterized species. Direct ASV-based analysis, at least for prokaryotes, therefore appears to offer significant advantages over OTU clustering at every percent identity cutoff.
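The direction of the reported effect can be illustrated with a toy calculation. The sketch below is not the authors' pipeline: the count table, the grouping of ASVs into "OTUs", and the choice of Whittaker's multiplicative beta (gamma divided by mean alpha richness) are all assumptions made for illustration. Real OTU clustering groups sequences by percent identity; here the groups are simply hard-coded column indices.

```python
import math

def richness(counts):
    """Number of features (ASVs or OTUs) with at least one read."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H' (natural log) for one sample."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def merge_features(table, groups):
    """Collapse columns into clusters by summing the counts in each group."""
    return [[sum(row[i] for i in group) for group in groups] for row in table]

def diversity_summary(table):
    """Return (mean alpha richness, Whittaker beta, gamma richness)."""
    alpha = [richness(row) for row in table]
    mean_alpha = sum(alpha) / len(alpha)
    gamma = richness([sum(col) for col in zip(*table)])
    return mean_alpha, gamma / mean_alpha, gamma

# Hypothetical data: 3 samples (rows) x 6 ASVs (columns).
asv_table = [
    [10, 5, 0, 0, 3, 0],
    [0, 8, 4, 0, 0, 2],
    [6, 0, 0, 7, 1, 0],
]

# Hypothetical clusters: pairs of ASVs assumed to fall within one identity cutoff.
groups = [[0, 1], [2, 3], [4, 5]]
otu_table = merge_features(asv_table, groups)

asv_summary = diversity_summary(asv_table)
otu_summary = diversity_summary(otu_table)

print("ASV (mean alpha, beta, gamma):", asv_summary)
print("OTU (mean alpha, beta, gamma):", otu_summary)
print("Sample 1 Shannon, ASV vs OTU:", shannon(asv_table[0]), shannon(otu_table[0]))
```

In this toy table, merging features lowers mean alpha richness, gamma richness, and the Shannon index of the first sample. That matches the direction of the underestimation described in the abstract, but it says nothing about its magnitude in real data.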

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f50/11449282/ef5ae163bc92/pone.0309065.g001.jpg
