Evaluation of short read metagenomic assembly.

Affiliation

Computer Science Department, George Mason University, Fairfax, Virginia, USA.

Publication information

BMC Genomics. 2011;12 Suppl 2(Suppl 2):S8. doi: 10.1186/1471-2164-12-S2-S8. Epub 2011 Jul 27.

DOI:10.1186/1471-2164-12-S2-S8

PMID:21989307

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3194239/

Abstract

BACKGROUND

Metagenomic assembly is a challenging problem because a sample contains genetic material from multiple organisms. The problem becomes even more difficult when short reads produced by next-generation sequencing technologies are used. Although whole-genome assemblers are not designed for metagenomic samples, they are used for metagenomics because assemblers capable of dealing with such samples are lacking. We present an evaluation of the assembly of simulated short-read metagenomic samples using a state-of-the-art de Bruijn graph-based assembler.
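To make the difficulty concrete, here is a minimal sketch of a de Bruijn graph built from reads of two toy "organisms" that share a short sequence. It is not the assembler evaluated in the paper; the function names (`de_bruijn_edges`, `branching_nodes`) and the toy reads are assumptions made for illustration. Nodes are (k-1)-mers, and a node with more than one successor is a point where a contig either has to stop or risks joining sequence from different genomes.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer successors seen in the reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor, in sorted order."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Hypothetical reads from two organisms that share the substring "ACGT".
reads = ["TTACGTGG", "CCACGTAA"]

small_k = branching_nodes(de_bruijn_edges(reads, 4))  # shared region merges the two paths
large_k = branching_nodes(de_bruijn_edges(reads, 6))  # k spans the shared region, paths stay apart
print(small_k, large_k)
```

In this toy case a larger k separates the two genomes, but in real data a larger k also requires higher coverage, so the choice is a trade-off rather than a rule.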

RESULTS

We assembled simulated metagenomic reads from datasets of various complexities using a state-of-the-art de Bruijn graph-based parallel assembler. We also studied the effect of the k-mer size used in the de Bruijn graph on metagenomic assembly and developed a clustering solution to pool the contigs obtained from different assembly runs, which allowed us to obtain longer contigs. In addition, we assessed the degree of chimericity of the assembled contigs using an entropy/impurity metric and compared the metagenomic assemblies to assemblies of the isolated individual source genomes.
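The abstract does not give the exact formula for the entropy/impurity metric, so the following is only a sketch under an assumption: each read in a contig carries the label of its (simulated) source genome, and chimericity is scored with Shannon entropy and Gini impurity over the label fractions. The function names and labels are hypothetical. A contig built from a single genome scores 0 on both; mixing sources raises both scores.

```python
import math
from collections import Counter


def label_fractions(labels):
    """Fraction of reads contributed by each source genome."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def entropy(labels):
    """Shannon entropy (in bits) of the source-genome labels in one contig."""
    return sum(-p * math.log2(p) for p in label_fractions(labels))


def gini_impurity(labels):
    """Gini impurity of the source-genome labels in one contig."""
    return 1.0 - sum(p * p for p in label_fractions(labels))


pure_contig = ["genome_A"] * 4                                        # all reads from one source
chimeric_contig = ["genome_A", "genome_A", "genome_B", "genome_B"]   # even mix of two sources

print(entropy(pure_contig), gini_impurity(pure_contig))
print(entropy(chimeric_contig), gini_impurity(chimeric_contig))
```

Scores like these can only be computed when the true origin of every read is known, which is why this kind of evaluation relies on simulated samples.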

CONCLUSIONS

Our results show that the accuracy of the assembled contigs was better than expected for metagenomic samples with a few dominant organisms and was especially poor for samples containing many closely related strains. Clustering contigs assembled with different k-mer sizes allowed us to obtain longer contigs; however, the clustering also accumulated erroneous contigs, which increased the error rate of the clustered contigs.
