Defazio Giuseppe, Tangaro Marco Antonio, Pesole Graziano, Fosso Bruno
Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Via E. Orabona 4, 70126, Bari, Italy.
Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Consiglio Nazionale delle Ricerche, Via G. Amendola 122/O, 70125, Bari, Italy.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae680.
The advent of high-throughput sequencing (HTS) technologies has unlocked the complexity of the microbial world through the development of metagenomics, which now provides an unprecedented and comprehensive overview of the taxonomic and functional contribution of microbes across a wide variety of macro- and micro-ecosystems. In particular, shotgun metagenomics allows the reconstruction of microbial genomes through the assembly of reads into metagenome-assembled genomes (MAGs). MAGs are an information-rich proxy for inferring the taxonomic composition and functional contribution of microbiomes, although the relevant analytical approaches are not trivial and can still be improved. In this regard, tools such as CAMITAX and GTDBtk implement complex approaches that rely on marker gene identification and sequence alignment and therefore require long processing times. With the aim of providing an effective tool for fast and reliable taxonomic classification of MAGs, we present kMetaShot, a taxonomy classifier based on k-mer/minimizer counting. We benchmarked kMetaShot against CAMITAX and GTDBtk using both in silico and real mock communities and show that, while implementing a fast and concise algorithm, it outperforms the other tools in classification accuracy. Additionally, kMetaShot is easy to install and use, making it suitable for researchers with limited command-line skills. It is available and documented at https://github.com/gdefazio/kMetaShot.
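To illustrate the general idea behind k-mer/minimizer counting mentioned in the abstract, the following is a minimal Python sketch. It is not kMetaShot's implementation: the function names (`canonical`, `minimizers`, `build_index`, `classify`), the toy values `k = 5` and `w = 4`, the lexicographic minimizer ordering, the simple "most shared minimizers wins" rule, and the toy reference sequences are all assumptions chosen for illustration only.

```python
from __future__ import annotations

from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def minimizers(seq: str, k: int = 5, w: int = 4) -> set[str]:
    """Return the set of minimizers: the smallest canonical k-mer in each
    window of w consecutive k-mers (lexicographic order, an assumption)."""
    seq = seq.upper()
    kmers = [canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    if not kmers:
        return set()  # sequence shorter than k
    if len(kmers) <= w:
        return {min(kmers)}  # a single, shorter window
    return {min(kmers[i:i + w]) for i in range(len(kmers) - w + 1)}


def build_index(references: dict[str, str], k: int = 5, w: int = 4) -> dict[str, set[str]]:
    """Map each minimizer to the set of (hypothetical) taxon labels containing it."""
    index: dict[str, set[str]] = {}
    for taxon, seq in references.items():
        for m in minimizers(seq, k, w):
            index.setdefault(m, set()).add(taxon)
    return index


def classify(query: str, index: dict[str, set[str]], k: int = 5, w: int = 4) -> str | None:
    """Assign the query to the taxon sharing the most minimizers, or None if no hit."""
    hits: Counter[str] = Counter()
    for m in minimizers(query, k, w):
        for taxon in index.get(m, ()):
            hits[taxon] += 1
    if not hits:
        return None
    return hits.most_common(1)[0][0]


# Toy reference sequences (invented for this example, not real genomes).
references = {
    "taxon_A": "ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTAGCTAGCTAGCATCG",
    "taxon_B": "AAAAATTTTTAAAAATTTTTAAAAATTTTT",
}
index = build_index(references)

# A fragment of taxon_A stands in for a contig from a MAG.
fragment = references["taxon_A"][5:35]
assignment = classify(fragment, index)
```

In this sketch only the minimizers are indexed rather than every k-mer, which keeps the lookup table small. A real classifier would typically use much longer k-mers, a hashed ordering, and a taxonomy-aware rule for resolving minimizers shared among taxa.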