Takács Bertalan, Jaksa Gábor, Qorri Erda, Gyuris Zoltán, Pintér Lajos, Haracska Lajos
HCEMM-HUN-REN BRC Mutagenesis and Carcinogenesis Research Group, Institute of Genetics, HUN-REN Biological Research Centre, H-6726 Szeged, Hungary.
Faculty of Science and Informatics, Department of Biology, Doctoral School of Biology, University of Szeged, H-6720 Szeged, Hungary.
NAR Genom Bioinform. 2025 Jul 4;7(3):lqaf092. doi: 10.1093/nargab/lqaf092. eCollection 2025 Sep.
Microbiome research has expanded rapidly in the last decade due to advances in sequencing technology, resulting in larger and more complex data. This has also led to the development of a plethora of metagenomic classifiers applying different algorithmic principles to classify microorganisms. However, accurate metagenomic classification remains challenging due to false positives and the need for dataset-specific tuning, limiting the comparability of distinct studies and clinical use. In this study, we demonstrate the discrepancy between current, commonly used classifiers and propose a novel classifier, NABAS+ (Novel Alignment-based Biome Analyzing Software+). NABAS+ uses BWA (Burrows-Wheeler aligner) alignment with strict RefSeq curation to ensure one reliable genome per species and filters for genomes with only high-quality reads for precise species-level identification from Illumina shotgun data. The performance of our algorithm and three commonly used classifiers was evaluated on datasets modelling human gastrooral communities, as well as on deeply sequenced microbial community standards. Additionally, we illustrated the usefulness of NABAS+ in detecting pathogens in real-world clinical data. Our results show that NABAS+, due to its extensive alignment process, is superior in accuracy and sensitivity compared to leading microbiome classifiers, particularly in reducing false positives in deep-sequenced microbial samples, making it suitable for clinical diagnosis.
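The filtering idea described in the abstract (align reads to one curated genome per species, then keep only genomes supported by high-quality reads) can be illustrated with a minimal sketch. This is not the NABAS+ implementation: the abstract does not say how "high-quality" is defined, so the mapping-quality cutoff `min_mapq`, the read-count cutoff `min_reads`, and the function names `count_confident_reads` and `call_species` are all assumptions made for illustration. The input is assumed to be SAM-format text such as an aligner like BWA produces, with one reference sequence per species.

```python
from collections import Counter


def count_confident_reads(sam_lines, min_mapq=30):
    """Count primary, confidently mapped reads per reference genome.

    min_mapq is an illustrative threshold, not a value taken from the paper.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        flag, ref, mapq = int(fields[1]), fields[2], int(fields[4])
        # Drop unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x904 or ref == "*":
            continue
        if mapq >= min_mapq:
            counts[ref] += 1
    return counts


def call_species(counts, min_reads=2):
    """Report genomes supported by at least min_reads confident reads (hypothetical cutoff)."""
    return sorted(ref for ref, n in counts.items() if n >= min_reads)
```

In this sketch, a genome that only attracts low-MAPQ or secondary alignments is never reported, which is one simple way to suppress false positives caused by reads shared between closely related genomes.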