MG-RAST version 4: lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis.

Publication information

Brief Bioinform. 2019 Jul 19;20(4):1151-1159. doi: 10.1093/bib/bbx105.

Abstract

As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large-volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1-3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements to the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners; these versions will enable double barcoding and stronger inferences supported by longer-read technologies, and will increase throughput while maintaining sensitivity by using Diamond and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, to facilitate both the development and the efficient high-performance implementation of the community's data analysis tasks.
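
The sensitivity/specificity trade-off that the abstract describes for gold-standard benchmarking can be made concrete with a small sketch. This is an illustration only, not MG-RAST code. It assumes a hypothetical gold-standard set of read IDs that truly belong to some category and the set of reads that a given pipeline configuration flagged. The function name, read IDs and data are all invented.

```python
def sensitivity_specificity(gold_positive, predicted_positive, all_items):
    """Return (sensitivity, specificity) of a prediction against a gold standard.

    All three arguments are sets of item identifiers (hypothetical read IDs here).
    """
    gold_negative = all_items - gold_positive
    tp = len(gold_positive & predicted_positive)   # correctly flagged
    fn = len(gold_positive - predicted_positive)   # missed
    tn = len(gold_negative - predicted_positive)   # correctly left alone
    fp = len(gold_negative & predicted_positive)   # wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented toy data: 10 reads, 4 of which are positives in the gold standard.
all_reads = {f"read{i}" for i in range(10)}
gold = {"read0", "read1", "read2", "read3"}
strict_run = {"read0", "read1"}  # misses two positives, no false hits
lenient_run = {"read0", "read1", "read2", "read3", "read4", "read5"}  # finds all, two false hits

strict = sensitivity_specificity(gold, strict_run, all_reads)
lenient = sensitivity_specificity(gold, lenient_run, all_reads)
print("strict :", strict)
print("lenient:", lenient)
```

In this toy case the stricter run gains specificity at the cost of sensitivity, and the lenient run does the reverse. A real benchmark would also record run-time cost for each configuration.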

Similar articles

Practical evaluation of 11 de novo assemblers in metagenome assembly.
J Microbiol Methods. 2018 Aug;151:99-105. doi: 10.1016/j.mimet.2018.06.007. Epub 2018 Jun 25.

Articles citing this article

Visualizing metagenomic and metatranscriptomic data: A comprehensive review.
Comput Struct Biotechnol J. 2024 May 3;23:2011-2033. doi: 10.1016/j.csbj.2024.04.060. eCollection 2024 Dec.

References cited by this article

Singularity: Scientific containers for mobility of compute.
PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017.

IMG/M: integrated genome and metagenome comparative data analysis system.
Nucleic Acids Res. 2017 Jan 4;45(D1):D507-D516. doi: 10.1093/nar/gkw929. Epub 2016 Oct 13.

The MG-RAST metagenomics database and portal in 2015.
Nucleic Acids Res. 2016 Jan 4;44(D1):D590-4. doi: 10.1093/nar/gkv1322. Epub 2015 Dec 9.
