Microbiome depiction through user-adapted bioinformatic pipelines and parameters.

Affiliations

Department of Human and Molecular Genetics, Herbert Wertheim College of Medicine, Florida International University, Miami, FL, USA.

Bioinformatics Research Group (BioRG), Knight Foundation School of Computing and Information Sciences, Florida International University, Miami, FL, USA.

Publication information

J Med Microbiol. 2023 Oct;72(10). doi: 10.1099/jmm.0.001756.

Abstract

The role of the microbiome in health and disease is increasingly recognized. However, the bioinformatic protocols used to analyse genomic data vary considerably. This variability has, in part, impeded the incorporation of microbiomics into the clinical setting and has challenged interstudy reproducibility. In microbial compositional analysis, there is growing recognition of the need to move away from a one-size-fits-all approach to data processing. Few evidence-based recommendations exist for setting the parameters of programs that infer microbiota community profiles, even though these parameters significantly affect the accuracy of taxonomic inference. We compared three commonly used programs (DADA2, QIIME2, and mothur) and optimized them into four user-adapted pipelines for processing paired-end amplicon reads, aiming to increase the accuracy of compositional inference and help standardize microbiomic protocols. Two key parameters were isolated across the four pipelines: filtering sequence reads based on a whole-number error threshold (maxEE) and truncating read ends based on a quality score threshold (QTrim). Closeness of sample inference was then evaluated using a mock community of known composition. The amount of raw genomic data lost was proportionate to how stringently the parameters were set, and exactly how much was lost varied by pipeline. Accuracy of sample inference correlated with increased sequence read retention. Falsely detected taxa and unaccounted-for microbial constituents were unique to each pipeline and parameter setting. Implementing optimized parameter values led to a better approximation of the known mock community. Microbial compositions generated from the 16S rRNA marker gene should be interpreted with caution. To improve microbial community profiling, bioinformatic protocols must be user-adapted. Analysis should take into account the selected target amplicon, the pipelines and parameters used, and the taxa of interest.
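
As a rough illustration of the two parameters named in the abstract, the sketch below shows how an expected-error filter and a quality-based truncation might act on a single read, and how an inferred profile could be compared with a mock community. It is not the authors' code. It assumes the DADA2-style conventions in which expected errors are the sum of per-base error probabilities, EE = sum(10^(-Q/10)), and a read is cut at the first base whose quality is at or below the threshold; the pipelines in the study may apply these steps differently. All function names, the example read, and the taxon names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def expected_errors(quals: List[int]) -> float:
    """Sum of per-base error probabilities, using p = 10^(-Q/10) for Phred score Q."""
    return sum(10 ** (-q / 10) for q in quals)


def truncate_at_quality(seq: str, quals: List[int], q_trim: int) -> Tuple[str, List[int]]:
    """Cut the read at the first base whose quality is <= q_trim (assumed convention)."""
    for i, q in enumerate(quals):
        if q <= q_trim:
            return seq[:i], quals[:i]
    return seq, quals


def passes_max_ee(quals: List[int], max_ee: float) -> bool:
    """Keep a read only if its expected errors do not exceed max_ee."""
    return expected_errors(quals) <= max_ee


def compare_to_mock(inferred: Set[str], expected: Set[str]) -> Dict[str, Set[str]]:
    """Report falsely detected taxa and mock-community members that were missed."""
    return {
        "false_positives": inferred - expected,
        "false_negatives": expected - inferred,
    }


# Hypothetical read: good quality at the start, two very poor bases at the end.
seq, quals = "ACGTACGT", [30, 30, 30, 30, 20, 20, 2, 2]
trimmed_seq, trimmed_quals = truncate_at_quality(seq, quals, q_trim=2)

print(trimmed_seq)                               # ACGTAC
print(passes_max_ee(quals, max_ee=1))            # False: the untrimmed read is discarded
print(passes_max_ee(trimmed_quals, max_ee=1))    # True: the truncated read is retained

# Hypothetical genus-level profiles, for illustration only.
report = compare_to_mock(
    inferred={"Escherichia", "Bacillus", "Pseudomonas"},
    expected={"Escherichia", "Bacillus", "Listeria"},
)
print(report["false_positives"])                 # {'Pseudomonas'}
print(report["false_negatives"])                 # {'Listeria'}
```

In this toy case the two low-quality trailing bases alone push the read over an expected-error limit of 1, so whether truncation runs before filtering decides whether the read is kept. This is one way the two parameters can interact to change read retention.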
