Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches.

Authors

Nearing Jacob T, Douglas Gavin M, Comeau André M, Langille Morgan G I

Affiliations

Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada.

Integrated Microbiome Resource, Dalhousie University, Halifax, Nova Scotia, Canada.

Publication information

PeerJ. 2018 Aug 8;6:e5364. doi: 10.7717/peerj.5364. eCollection 2018.

Abstract

High-depth sequencing of universal marker genes such as the 16S rRNA gene is a common strategy to profile microbial communities. Traditionally, sequence reads are clustered into operational taxonomic units (OTUs) at a defined identity threshold to avoid sequencing errors generating spurious taxonomic units. However, numerous bioinformatic packages have recently been released that attempt to correct sequencing errors and determine real biological sequences at single-nucleotide resolution by generating amplicon sequence variants (ASVs). As more researchers begin to use high-resolution ASVs, there is a need for an in-depth and unbiased comparison of these novel "denoising" pipelines. In this study, we conduct a thorough comparison of three of the most widely used denoising packages (DADA2, UNOISE3, and Deblur) as well as an open-reference 97% OTU clustering pipeline on mock, soil, and host-associated communities. We found from the mock community analyses that although the approaches produced similar microbial compositions based on relative abundance, they identified vastly different numbers of ASVs, which significantly impacts alpha-diversity metrics. Our analysis of real datasets using recommended settings for each denoising pipeline also showed that the three packages were consistent in their per-sample compositions, resulting in only minor differences based on weighted UniFrac and Bray-Curtis dissimilarity. DADA2 tended to find more ASVs than the other two denoising pipelines when analyzing both the real soil data and two other host-associated datasets, suggesting that it could be better at finding rare organisms, but at the expense of possible false positives. The open-reference OTU clustering approach identified considerably more OTUs than the number of ASVs from the denoising pipelines in all datasets tested. The three denoising approaches differed significantly in their run times, with UNOISE3 running more than 1,200 and 15 times faster than DADA2 and Deblur, respectively. Our findings indicate that, although all pipelines result in similar general community structure, the number of ASVs/OTUs and the resulting alpha-diversity metrics vary considerably, and this should be considered when attempting to identify rare organisms from possible background noise.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8548/6087418/f04d1025e329/peerj-06-5364-g001.jpg
