Utilization of defined microbial communities enables effective evaluation of meta-genomic assemblies.

Authors

Greenwald William W, Klitgord Niels, Seguritan Victor, Yooseph Shibu, Venter J Craig, Garner Chad, Nelson Karen E, Li Weizhong

Affiliations

Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA.

Human Longevity Inc, San Diego, CA, USA.

Publication information

BMC Genomics. 2017 Apr 13;18(1):296. doi: 10.1186/s12864-017-3679-5.

DOI:10.1186/s12864-017-3679-5

PMID:28407798

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5390407/

Abstract

BACKGROUND

Metagenomics is the study of the microbial genomes isolated from communities found on our bodies or in our environment. By correctly determining the relation between human health and human-associated microbial communities, novel mechanisms of health and disease can be found, enabling the development of novel diagnostics and therapeutics. Because microbial communities are so diverse, strategies developed for aligning human genomes cannot be used, and the genomes of the microbial species in a community must be assembled de novo. To obtain the best metagenomic assemblies, it is important to choose the proper assembler. Because metagenomics is evolving rapidly, new assemblers are constantly being created, and the field has not yet agreed on a standardized process. Furthermore, the truth sets used to compare these methods are either too simple (computationally derived diverse communities) or too complex (microbial communities of unknown composition), yielding results that are hard to interpret. In this analysis, we interrogate the strengths and weaknesses of five popular assemblers using defined biological samples of known genomic composition and abundance. We assessed each assembler on its ability to reassemble genomes, call taxonomic abundances, and recreate open reading frames (ORFs).
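To make the idea of a truth set with known abundance concrete, the following is a minimal sketch of how called taxonomic abundances might be scored against a defined community. It is not the authors' pipeline: the function names, the organism labels, and the read counts are all hypothetical, and the per-taxon absolute error is just one simple choice of metric.

```python
# Hypothetical sketch: score called abundances against a defined community.
# None of these names or numbers come from the paper.

def relative_abundance(counts):
    """Convert raw per-taxon counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no counts to normalize")
    return {taxon: n / total for taxon, n in counts.items()}


def abundance_error(expected, observed):
    """Absolute error per taxon; a taxon missing on either side counts as 0."""
    taxa = set(expected) | set(observed)
    return {t: abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa}


# Known composition of the defined community (assumed values).
expected = {"organism_A": 0.5, "organism_B": 0.3, "organism_C": 0.2}

# Reads assigned to each organism after assembly (assumed values).
assigned_reads = {"organism_A": 600, "organism_B": 300, "organism_C": 100}

observed = relative_abundance(assigned_reads)
errors = abundance_error(expected, observed)
total_error = sum(errors.values())

for taxon in sorted(errors):
    print(f"{taxon}: expected={expected[taxon]:.2f} "
          f"observed={observed[taxon]:.2f} error={errors[taxon]:.2f}")
print(f"total absolute error: {total_error:.2f}")
```

With a defined community, a score like this can be computed for each assembler and compared directly, which is not possible when the composition of the sample is unknown.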

RESULTS

We tested five metagenomic assemblers (Omega, metaSPAdes, IDBA-UD, metaVelvet, and MEGAHIT) on known and synthetic metagenomic data sets. MetaSPAdes excelled in diverse sets, IDBA-UD performed well all around, metaVelvet had high accuracy for high-abundance organisms, and MEGAHIT was able to accurately differentiate similar organisms within a community. At the ORF level, metaSPAdes and MEGAHIT had the fewest missing ORFs within diverse and similar communities, respectively.
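The ORF-level comparison can be illustrated with a small sketch that counts reference ORFs absent from each assembly. This assumes the recovered ORFs have already been matched to reference ORF identifiers by some upstream step; the assembler labels, ORF identifiers, and the `missing_orfs` helper are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: count reference ORFs that each assembly failed to recover.
# Identifiers and assembler labels are placeholders, not data from the study.

def missing_orfs(reference_orfs, recovered_orfs):
    """Return the reference ORF identifiers not found in the recovered set."""
    return sorted(set(reference_orfs) - set(recovered_orfs))


# ORFs annotated on the known genomes of the defined community (assumed).
reference = ["orf1", "orf2", "orf3", "orf4"]

# ORFs recovered from each assembly, already mapped to reference IDs (assumed).
recovered_by_assembler = {
    "assembler_X": ["orf1", "orf2", "orf4"],
    "assembler_Y": ["orf1"],
}

missing_counts = {
    name: len(missing_orfs(reference, recovered))
    for name, recovered in recovered_by_assembler.items()
}

# The assembler with the fewest missing ORFs under this simple criterion.
best = min(missing_counts, key=missing_counts.get)

for name, count in sorted(missing_counts.items()):
    print(f"{name}: {count} missing ORFs")
print(f"fewest missing: {best}")
```

Running the same count separately on a diverse community and on a community of closely related organisms would show whether the ranking of assemblers changes between the two settings.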

CONCLUSIONS

The appropriate assembler depends on the metagenomic question being asked. Because different assemblers will give different answers to the same question, it is important to clearly define the biological problem of an experiment and choose the assembler accordingly.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4cfc/5390407/6dd133112440/12864_2017_3679_Fig1_HTML.jpg
