Department of Computer Science and Engineering, University of California, San Diego, CA, USA.
Cell Wall Biology and Utilization Laboratory, Dairy Forage Research Center, USDA, Madison, WI, USA.
Nat Methods. 2020 Nov;17(11):1103-1110. doi: 10.1038/s41592-020-00971-x. Epub 2020 Oct 5.
Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and showed that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.