Karim Lily M, Martínez-Martínez Francisco José, O'Farrell Ash, Hinrichs Angie S, Sanderson Theo, Iqbal Zamin, Kozyreva Varvara K, Bell John M, López Mariana G, Comas Iñaki, Corbett-Detig Russell
Department of Biomolecular Engineering, University of California, Santa Cruz.
Genomics Institute, University of California, Santa Cruz.
medRxiv. 2025 Jul 23:2025.07.22.25331806. doi: 10.1101/2025.07.22.25331806.
Mycobacterium tuberculosis (MTB), the bacterium responsible for tuberculosis (TB), remains a leading global infectious disease killer, and genomic epidemiology is essential for understanding its transmission dynamics. Computational limitations prevent comprehensive phylogenetic analysis of the publicly available genomes. Here, we create UShER-TB, a comprehensive pipeline for scalable phylogenomic MTB analysis. We processed 129,312 MTB genomes to construct a comprehensive global phylogeny capturing unprecedented genomic diversity. UShER-TB achieved high accuracy in transmission cluster reconstruction. The comprehensive phylogeny also facilitated the identification of putative novel lineages and sublineages and the successful placement of ancient DNA samples. The UShER-TB platform enables real-time phylogenomic analysis of new genomes, revealing transmission hotspots and introduction patterns at global scales. Our approach overcomes longstanding computational barriers, providing researchers with efficient tools for TB genomic surveillance, especially in resource-limited settings where the TB burden is highest.