Shaw Jim, Marin Maximillian G, Li Heng
Department of Data Science, Dana-Farber Cancer Institute.
Department of Biomedical Informatics, Harvard Medical School.
bioRxiv. 2025 Sep 6:2025.09.05.674543. doi: 10.1101/2025.09.05.674543.
Long-read metagenome assembly promises complete genomic recovery from microbiomes. However, the complexity of metagenomes poses challenges. We present myloasm, a metagenome assembler for PacBio HiFi and Oxford Nanopore Technologies (ONT) R10.4 long reads. Myloasm uses polymorphic k-mers to construct a high-resolution string graph and then leverages differential abundance for graph simplification. On real-world ONT metagenomes, myloasm assembled three times more complete circular contigs than the next-best assembler. Myloasm can make ONT and HiFi comparable for assembly: for a jointly sequenced gut metagenome, myloasm with ONT assembled more complete circular genomes than any assembler with HiFi. Myloasm recovers previously inaccessible within-species diversity; we recovered six complete single-contig genomes from a gut metagenome and eight complete TM7 (Saccharibacteria) contigs with > 93% similarity from an oral metagenome. With this improved resolution, we resolved two 98% similar antibiotic resistance genes spreading through distinct strain-specific mobile genetic elements in a human gut.