A framework for assessing 16S rRNA marker-gene survey data analysis methods using mixtures.

Author affiliations

Biosystems and Biomaterials Division, National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, 20899, MD, USA.

Center for Bioinformatics and Computational Biology, University of Maryland, College Park, 8314 Paint Branch Dr., College Park, 20742, MD, USA.

Publication information

Microbiome. 2020 Mar 13;8(1):35. doi: 10.1186/s40168-020-00812-1.

DOI: 10.1186/s40168-020-00812-1
PMID: 32169095
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7071580/
Abstract

BACKGROUND

A variety of bioinformatic pipelines and downstream analysis methods are available for analyzing 16S rRNA marker-gene surveys. However, there is limited guidance for choosing among them, so appropriate assessment datasets and metrics are needed. Mixtures of environmental samples are useful for assessing analysis methods because expected values can be calculated from the unmixed sample measurements and the mixture design, and methods can then be evaluated against those expected values. Previous studies have used mixtures of environmental samples to assess other sequencing methods such as RNAseq, but no studies have used mixtures of environmental samples to assess 16S rRNA sequencing.
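As an illustration of how an expected value can follow from unmixed measurements and a mixture design, here is a minimal sketch. It assumes simple linear mixing of two samples, with `theta` the fraction of the mixture contributed by sample A; the abstract does not state the formula, so the function names, the parameterization, and the example numbers are all hypothetical.

```python
def expected_relative_abundance(q_a, q_b, theta):
    """Expected relative abundance of one feature in a two-sample mixture.

    q_a, q_b: relative abundance of the feature in unmixed samples A and B.
    theta:    fraction of the mixture contributed by sample A.
    Assumption: the mixture is a linear combination of the two samples.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must be between 0 and 1")
    return theta * q_a + (1.0 - theta) * q_b


def relative_error(observed, expected):
    """Signed error of an observed value relative to its expected value."""
    if expected == 0:
        raise ValueError("expected value must be non-zero")
    return (observed - expected) / expected


if __name__ == "__main__":
    # Hypothetical feature: 8% of sample A, 1% of sample B.
    for theta in (1.0, 0.5, 0.25, 0.0):
        print(theta, expected_relative_abundance(0.08, 0.01, theta))
```

A pipeline's observed relative abundance for each mixture can then be compared with the corresponding expected value, for example through `relative_error`.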

RESULTS

We developed a framework for assessing 16S rRNA sequencing analysis methods that uses a novel two-sample titration mixture dataset and metrics to evaluate qualitative and quantitative characteristics of count tables. Our qualitative assessment evaluates feature presence/absence: it takes features present only in unmixed samples or only in titrations and tests whether random sampling can account for their observed relative abundance. Our quantitative assessment evaluates feature relative and differential abundance by comparing observed and expected values. We demonstrated the framework by evaluating count tables generated with three commonly used bioinformatic pipelines: (i) DADA2, a sequence inference method; (ii) Mothur, a de novo clustering method; and (iii) QIIME, an open-reference clustering method. The qualitative assessment results indicated that the majority of Mothur and QIIME features present only in unmixed samples or titrations were accounted for by random sampling alone, but this was not the case for DADA2 features. Combined with count table sparsity (the proportion of zero-valued cells in a count table), these results indicate that DADA2 has a higher false-negative rate, whereas Mothur and QIIME have higher false-positive rates. The quantitative assessment results indicated that the observed relative abundance and differential abundance values were consistent with expected values for all three pipelines.
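Two of the quantities mentioned above can be sketched in a few lines. The sparsity function follows the definition given in the text. The sampling check is only one simple way to ask whether random sampling could explain a feature's absence, using the binomial probability of drawing zero reads; it is not necessarily the test the authors used, and the function names, the `alpha` threshold, and the example numbers are assumptions.

```python
def sparsity(count_table):
    """Proportion of zero-valued cells in a count table (a list of rows)."""
    cells = [count for row in count_table for count in row]
    if not cells:
        raise ValueError("count table is empty")
    return sum(1 for count in cells if count == 0) / len(cells)


def prob_unobserved(expected_proportion, total_count):
    """Binomial probability of drawing zero reads of a feature.

    expected_proportion: expected relative abundance of the feature.
    total_count:         total number of reads in the sample.
    """
    if not 0.0 <= expected_proportion <= 1.0:
        raise ValueError("expected_proportion must be between 0 and 1")
    return (1.0 - expected_proportion) ** total_count


def absence_explained_by_sampling(expected_proportion, total_count, alpha=0.05):
    """True if a zero count is plausible under random sampling alone.

    alpha is a hypothetical cutoff chosen for illustration.
    """
    return prob_unobserved(expected_proportion, total_count) >= alpha


if __name__ == "__main__":
    # Hypothetical count table: 4 features (rows) by 3 samples (columns).
    table = [[0, 12, 3], [5, 0, 0], [0, 0, 7], [1, 0, 9]]
    print(sparsity(table))  # 6 of 12 cells are zero -> 0.5

    # A rare feature can plausibly be missed at a depth of 50,000 reads,
    # while a more abundant one is unlikely to be missed by chance.
    print(absence_explained_by_sampling(1e-5, 50_000))
    print(absence_explained_by_sampling(1e-3, 50_000))
```

Under this reading, a feature whose absence is not plausible under sampling alone points to a false negative, while a high-sparsity table whose unexpected features are all explainable by sampling is consistent with many low-abundance, possibly spurious features.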

CONCLUSIONS

We developed a novel framework for assessing 16S rRNA marker-gene survey methods and demonstrated the framework by evaluating count tables generated with three bioinformatic pipelines. This framework is a valuable community resource for assessing 16S rRNA marker-gene survey bioinformatic methods and will help scientists identify appropriate analysis methods for their marker-gene surveys.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/45a4d62d54d8/40168_2020_812_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/7c676116087d/40168_2020_812_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/462f3db3a091/40168_2020_812_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/1180510838d4/40168_2020_812_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/5f8dcfd2736b/40168_2020_812_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/f8aeefcfbef3/40168_2020_812_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/15bee9b056ad/40168_2020_812_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2fa/7071580/65446036f210/40168_2020_812_Fig8_HTML.jpg

Similar articles

1
A framework for assessing 16S rRNA marker-gene survey data analysis methods using mixtures.
Microbiome. 2020 Mar 13;8(1):35. doi: 10.1186/s40168-020-00812-1.
2
Comparing bioinformatic pipelines for microbial 16S rRNA amplicon sequencing.
PLoS One. 2020 Jan 16;15(1):e0227434. doi: 10.1371/journal.pone.0227434. eCollection 2020.
3
A comprehensive evaluation of the sl1p pipeline for 16S rRNA gene sequencing analysis.
Microbiome. 2017 Aug 14;5(1):100. doi: 10.1186/s40168-017-0314-2.
4
Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies.
Microbiome. 2016 Nov 25;4(1):62. doi: 10.1186/s40168-016-0208-8.
5
A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome.
BMC Microbiol. 2017 Sep 13;17(1):194. doi: 10.1186/s12866-017-1101-8.
6
Ultrafast and accurate 16S rRNA microbial community analysis using Kraken 2.
Microbiome. 2020 Aug 28;8(1):124. doi: 10.1186/s40168-020-00900-2.
7
Microbiome depiction through user-adapted bioinformatic pipelines and parameters.
J Med Microbiol. 2023 Oct;72(10). doi: 10.1099/jmm.0.001756.
8
Galaxy mothur Toolset (GmT): a user-friendly application for 16S rRNA gene sequencing analysis using mothur.
Gigascience. 2019 Feb 1;8(2). doi: 10.1093/gigascience/giy166.
9
Beware to ignore the rare: how imputing zero-values can improve the quality of 16S rRNA gene studies results.
BMC Bioinformatics. 2022 Feb 7;22(Suppl 15):618. doi: 10.1186/s12859-022-04587-0.
10
Piphillin predicts metagenomic composition and dynamics from DADA2-corrected 16S rDNA sequences.
BMC Genomics. 2020 Jan 17;21(1):56. doi: 10.1186/s12864-019-6427-1.

Cited by

1
Influence of host phylogeny and water physicochemistry on microbial assemblages of the fish skin microbiome.
FEMS Microbiol Ecol. 2024 Feb 14;100(3). doi: 10.1093/femsec/fiae021.
2
Differential richness inference for 16S rRNA marker gene surveys.
Genome Biol. 2022 Aug 1;23(1):166. doi: 10.1186/s13059-022-02722-x.
