Han Haitao, Wang Ziye, Zhu Shanfeng
Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China.
School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China.
Nat Commun. 2025 Mar 24;16(1):2865. doi: 10.1038/s41467-025-57957-6.
Metagenomic binning is a culture-free approach that facilitates the recovery of metagenome-assembled genomes by grouping genomic fragments. However, a comprehensive benchmark that evaluates metagenomic binning tools across combinations of data types and binning modes is still lacking. In this study, we benchmark 13 metagenomic binning tools on short-read, long-read, and hybrid data under co-assembly, single-sample, and multi-sample binning. The results show that multi-sample binning performs best across short-read, long-read, and hybrid data. Moreover, multi-sample binning outperforms the other binning modes in identifying potential antibiotic resistance gene hosts and near-complete strains containing potential biosynthetic gene clusters across diverse data types. This study also recommends three efficient binners across all data-binning combinations, as well as high-performance binners for each combination.