AMROMICS JSC, Nghe An, Vietnam.
Faculty of IT, Hanoi University of Civil Engineering, Hanoi, Vietnam.
Genome Biol. 2024 Aug 6;25(1):209. doi: 10.1186/s13059-024-03362-z.
Pangenome inference is an indispensable step in bacterial genomics, yet the rapid growth of genomic collections makes its scalability a challenge. This paper presents PanTA, a software package for constructing pangenomes of large bacterial datasets, which shows unprecedented efficiency, multiple times higher than that of existing tools. PanTA introduces a novel mechanism to construct the pangenome progressively, without rebuilding the accumulated collection from scratch. In managing growing datasets, the progressive mode is shown to consume orders of magnitude fewer computational resources than existing solutions. The software is open source and publicly available at https://github.com/amromics/panta and on figshare (doi: 10.6084/m9.figshare.23724705).