Earlham Institute, Norwich Research Park, Norwich, UK.
Natural History Museum, London, UK.
Brief Bioinform. 2020 Mar 23;21(2):584-594. doi: 10.1093/bib/bbz020.
In recent years, the use of longer-range read data combined with advances in assembly algorithms has driven substantial improvements in the contiguity and quality of genome assemblies. However, these advances have not transferred directly to metagenomic data sets, because assumptions made by single-genome assembly algorithms do not hold when assembling multiple genomes at varying levels of abundance. Dedicated assemblers for metagenomic data were a relatively late innovation, and for many years researchers had to make do with tools designed for single genomes. This has changed in the last few years, with the emergence of a new type of tool built on different principles. In this review, we describe the challenges inherent in metagenomic assembly and compare the different approaches taken by these novel assembly tools.