Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA.
mSphere. 2021 Aug 25;6(4):e0019121. doi: 10.1128/mSphere.00191-21. Epub 2021 Jul 21.
Amplicon sequence variants (ASVs) have been proposed as an alternative to operational taxonomic units (OTUs) for analyzing microbial communities. ASVs have grown in popularity, in part because of a desire to reflect a more refined level of taxonomy, since they do not cluster sequences based on a distance-based threshold. However, ASVs and the use of overly narrow thresholds to identify OTUs increase the risk of splitting a single genome into separate clusters. To assess this risk, I analyzed the intragenomic variation of 16S rRNA genes from the bacterial genomes represented in an rrn copy number database, which contained 20,427 genomes from 5,972 species. As the number of copies of the 16S rRNA gene increased in a genome, the number of ASVs also increased. There was an average of 0.58 ASVs per copy of the 16S rRNA gene for full-length 16S rRNA genes. It was necessary to use a distance threshold of 5.25% to cluster full-length ASVs from the same genome into a single OTU with 95% confidence for genomes with 7 copies of the 16S rRNA gene, such as Escherichia coli. This research highlights the risk of splitting a single bacterial genome into separate clusters when ASVs are used to analyze 16S rRNA gene sequence data. Although there is also a risk of clustering ASVs from different species into the same OTU when using broad distance thresholds, these risks are of less concern than artificially splitting a genome into separate ASVs and OTUs.

16S rRNA gene sequencing has engendered significant interest in studying microbial communities. There has been tension between trying to classify 16S rRNA gene sequences to increasingly lower taxonomic levels and the reality that those levels were defined using more sequence and physiological information than is available from a fragment of the 16S rRNA gene. Furthermore, the naming of bacterial taxa reflects the biases of those who name them. One motivation for the recent push to adopt ASVs in place of OTUs in microbial community analyses is to allow researchers to perform their analyses at the finest possible level that reflects species-level taxonomy. The current research is significant because it quantifies the risk of artificially splitting bacterial genomes into separate clusters. Far from providing a better representation of bacterial taxonomy and biology, the ASV approach can lead to conflicting inferences about the ecology of different ASVs from the same genome.
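The two per-genome quantities the abstract reports, ASVs per 16S rRNA gene copy and the distance threshold needed to pull all of a genome's ASVs into one OTU, can be illustrated with a minimal Python sketch. It is not the author's pipeline. It assumes the copies are already aligned to equal length, uses a simple mismatch fraction as the distance, and treats the largest pairwise distance as the threshold (which corresponds to furthest-neighbor clustering). The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def summarize_genome(copies: list[str]) -> dict:
    """Summarize intragenomic variation for one genome's 16S rRNA gene copies."""
    asvs = sorted(set(copies))  # identical copies collapse into a single ASV
    distances = [pairwise_distance(a, b) for a, b in combinations(asvs, 2)]
    return {
        "n_copies": len(copies),
        "n_asvs": len(asvs),
        "asvs_per_copy": len(asvs) / len(copies),
        # Smallest cutoff at which every pair of ASVs is within the threshold,
        # so all of them would join one OTU under furthest-neighbor clustering.
        "min_threshold_single_otu": max(distances, default=0.0),
    }


# Toy genome (made-up data): 4 copies, 20 aligned positions each.
toy_copies = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGT",  # identical to the first copy
    "ACGTACGTACGAACGTACGT",  # 1 mismatch relative to the first copy
    "ACGTTCGTACGTACGTACGA",  # 2 mismatches relative to the first copy
]
summary = summarize_genome(toy_copies)
print(summary)
```

Under these assumptions the sketch handles one genome at a time. A figure like the 5.25% threshold would come from repeating the calculation across many genomes with the same copy number and taking a high percentile of the per-genome thresholds.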