Analysis of sequencing strategies and tools for taxonomic annotation: Defining standards for progressive metagenomics.

Affiliations

Consorcio de Investigación del Golfo de México (CIGOM), Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.

Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.

Publication information

Sci Rep. 2018 Aug 13;8(1):12034. doi: 10.1038/s41598-018-30515-5.

Abstract

Metagenomics research has recently thrived owing to improvements in DNA sequencing technologies, which have driven the emergence of new analysis tools and the growth of taxonomic databases. However, no all-purpose strategy can guarantee the best result for a given project, and several combinations of software, parameters and databases can be tested. Therefore, we performed an impartial comparison, using statistical measures of classification, of eight bioinformatic tools and four taxonomic databases, defining a benchmark framework to evaluate each tool in a standardized context. Using in silico simulated data for 16S rRNA amplicons and whole metagenome shotgun data, we compared the results from different software and database combinations to detect biases related to algorithms or database annotation. Using our benchmark framework, researchers can define cut-off values to evaluate the expected error rate and coverage for their results, regardless of the score used by each software. A quick guide to selecting the best tool, together with all datasets and scripts needed to reproduce our results and benchmark any new method, is available at https://github.com/Ales-ibt/Metagenomic-benchmark. Finally, we stress the importance of gold standards, database curation and manual inspection of taxonomic profiling results for a better and more accurate description of microbial diversity.
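The idea of choosing a cut-off value and reading off the expected error rate and coverage can be illustrated with a minimal sketch. This is not the authors' code and does not reproduce the scripts in their repository. The function name `evaluate_cutoff`, the input layout (a list of score/correctness pairs from a simulated dataset with known taxonomy) and the definitions used here (coverage as the fraction of reads retained, error rate as the fraction of retained reads that are misclassified) are assumptions made for illustration only.

```python
from typing import List, Tuple


def evaluate_cutoff(
    assignments: List[Tuple[float, bool]], cutoff: float
) -> Tuple[float, float]:
    """Return (coverage, error_rate) for a given score cut-off.

    assignments: hypothetical (score, is_correct) pairs, one per simulated
    read, where is_correct says whether the tool's taxonomic label matches
    the known origin of the read.
    """
    # Keep only the reads whose classification score passes the cut-off.
    kept = [is_correct for score, is_correct in assignments if score >= cutoff]

    # Coverage (assumed definition): share of all reads that are retained.
    coverage = len(kept) / len(assignments) if assignments else 0.0

    # Error rate (assumed definition): share of retained reads that are wrong.
    wrong = sum(1 for is_correct in kept if not is_correct)
    error_rate = wrong / len(kept) if kept else 0.0

    return coverage, error_rate


# Toy data: five simulated reads with made-up scores.
toy = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for cutoff in (0.0, 0.5, 0.7):
    print(cutoff, evaluate_cutoff(toy, cutoff))
```

Because the calculation only needs a score and a correct/incorrect flag per read, the same comparison can in principle be applied to tools that report different kinds of scores, which is the sense in which a cut-off can be assessed regardless of the score each software uses.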


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/61ed/6089906/7863bb0808e3/41598_2018_30515_Fig1_HTML.jpg
