Nitrogen Cycling Microbial Diversity and Operational Taxonomic Unit Clustering: When to Prioritize Accuracy Over Speed.

Author information

Egenriether Sada, Sanford Robert, Yang Wendy H, Kent Angela D

Affiliations

Program in Ecology, Evolution and Conservation Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States.

Department of Geology, University of Illinois at Urbana-Champaign, Urbana, IL, United States.

Publication information

Front Microbiol. 2022 May 26;13:730340. doi: 10.3389/fmicb.2022.730340. eCollection 2022.

Abstract

BACKGROUND

Assessments of the soil microbiome provide valuable insight into ecosystem function because of the integral role microorganisms play in the biogeochemical cycling of carbon and nutrients. For example, treatment effects on nitrogen cycling functional groups are often presented alongside one another to demonstrate how agricultural management practices affect various nitrogen cycling processes. However, the functional groups commonly evaluated in nitrogen cycling microbiome studies range from phylogenetically narrow (e.g., N-fixation, nitrification) to broad [e.g., denitrification, dissimilatory nitrate reduction to ammonium (DNRA)]. The bioinformatics methods used in such studies were developed for 16S rRNA gene sequence data, and how these tools perform across functional genes of different phylogenetic diversity has not been established. For example, an OTU clustering method that can accurately characterize sequences harboring comparatively little diversity may not accurately resolve the diversity within a gene composed of a large number of clades. This study uses two nitrogen cycling genes, one that segregates into only three distinct clades and one that comprises at least eighteen clades, to investigate differences that may arise when using heuristic OTU clustering (abundance-based greedy clustering, AGC) vs. true hierarchical OTU clustering (Matthews Correlation Coefficient optimizing algorithm, Opti-MCC). Detection of treatment differences for each gene was evaluated to demonstrate how conclusions drawn from a given dataset may differ depending on the clustering method used.
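To illustrate the quantity that an MCC-optimizing clusterer works toward, the following minimal Python sketch scores a candidate OTU assignment with the Matthews Correlation Coefficient computed over all sequence pairs. It is not the Opti-MCC implementation used in the study. The function name clustering_mcc, the toy sequence labels and distances, and the 0.03 distance threshold are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def clustering_mcc(distances, otu_of, threshold=0.03):
    """Score an OTU assignment with the Matthews Correlation Coefficient.

    Each pair of sequences is treated as one classification outcome:
      TP: within the threshold and placed in the same OTU
      FN: within the threshold but placed in different OTUs
      FP: beyond the threshold but placed in the same OTU
      TN: beyond the threshold and placed in different OTUs
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        close = distances[frozenset((a, b))] <= threshold
        same = otu_of[a] == otu_of[b]
        if close and same:
            tp += 1
        elif close:
            fn += 1
        elif same:
            fp += 1
        else:
            tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the MCC is undefined (zero denominator).
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: s1/s2 are near-identical, s3/s4 are near-identical,
# and every other pair is far apart.
toy_distances = {
    frozenset(("s1", "s2")): 0.01,
    frozenset(("s3", "s4")): 0.02,
    frozenset(("s1", "s3")): 0.10,
    frozenset(("s1", "s4")): 0.10,
    frozenset(("s2", "s3")): 0.10,
    frozenset(("s2", "s4")): 0.10,
}

well_separated = {"s1": "OTU1", "s2": "OTU1", "s3": "OTU2", "s4": "OTU2"}
lumped = {"s1": "OTU1", "s2": "OTU1", "s3": "OTU1", "s4": "OTU1"}

mcc_separated = clustering_mcc(toy_distances, well_separated)
mcc_lumped = clustering_mcc(toy_distances, lumped)
print(mcc_separated, mcc_lumped)
```

In this toy case the assignment that keeps the two groups apart scores higher than the one that lumps every sequence into a single large OTU, which is the kind of over-assignment to abundant OTUs that the abstract attributes to the greedy method.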

RESULTS

The heuristic and hierarchical methods performed comparably for the more conserved gene. The hierarchical method outperformed the heuristic method for the more diverse gene; this included both the ability to detect treatment differences using PERMANOVA and higher resolution in taxonomic classification. The difference in performance between the two methods may be traced to the AGC method's preferential assignment of sequences to the most abundant OTUs: when analysis was limited to only the largest 100 OTUs, results from the AGC-assembled OTU table more closely resembled those of the Opti-MCC OTU table. Additionally, both AGC and Opti-MCC OTU tables detected comparable treatment differences using the rank-based ANOSIM test. This demonstrates that treatment differences were preserved using both clustering methods but were structured differently within the OTU tables produced using each method.
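As a rough illustration of what "rank-based" means for ANOSIM, the sketch below computes the ANOSIM R statistic from a dissimilarity matrix using only the ranks of the pairwise dissimilarities. It is a simplified sketch that omits the permutation test and is not the analysis pipeline used in the study. The function names, the toy matrix, and the group labels are hypothetical.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dissim, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dissim[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


# Hypothetical dissimilarities for four samples: two control, two treated.
toy_dissim = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.8, 0.9],
    [0.6, 0.8, 0.0, 0.2],
    [0.7, 0.9, 0.2, 0.0],
]
toy_groups = ["control", "control", "treated", "treated"]

r_original = anosim_r(toy_dissim, toy_groups)
# Squaring every dissimilarity changes magnitudes but not their rank order.
toy_squared = [[d * d for d in row] for row in toy_dissim]
r_squared = anosim_r(toy_squared, toy_groups)
print(r_original, r_squared)
```

Because R depends only on rank order, rescaling the dissimilarities without reordering them leaves it unchanged. This may be one reason a rank-based test can reach similar conclusions from OTU tables that distribute the same treatment signal differently, although the sketch does not demonstrate that for the study's data.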

CONCLUSION

For questions that can be answered using tests agnostic to clustering method (e.g., ANOSIM), or for genes of relatively low phylogenetic diversity, most upstream processing methods should lead to similar conclusions from downstream analyses. For studies involving more diverse genes, however, care should be exercised to choose methods that ensure accurate clustering for all genes. This will mitigate the risk of introducing Type II errors by allowing for detection of comparable treatment differences for all genes assessed, rather than disproportionately detecting treatment differences in only low-diversity genes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc11/9201982/c45055d17fbc/fmicb-13-730340-g001.jpg
