University of Warwick, Coventry CV4 7AL, UK.
Philos Trans R Soc Lond B Biol Sci. 2022 Oct 10;377(1861):20210240. doi: 10.1098/rstb.2021.0240. Epub 2022 Aug 22.
The definition of bacterial species is traditionally a taxonomic issue, while bacterial populations are identified by population genetics. These assignments are species-specific and depend on the practitioner. Legacy multilocus sequence typing (MLST) is commonly used to identify sequence types (STs) and clusters (ST Complexes). However, these approaches are not adequate for the millions of genomic sequences from bacterial pathogens that have been generated since 2012. EnteroBase (http://enterobase.warwick.ac.uk) automatically clusters core genome MLST (cgMLST) allelic profiles into hierarchical clusters (HierCC) after assembling annotated draft genomes from short-read sequences. HierCC clusters span core sequence diversity from the species level down to individual transmission chains. Here we evaluate HierCC's ability to correctly assign 100 000s of genomes to the species/subspecies and population levels for [...] and [...]. HierCC assignments were more consistent with maximum-likelihood super-trees of core SNPs or presence/absence of accessory genes than classical taxonomic assignments or 95% average nucleotide identity (ANI). However, neither HierCC nor ANI was uniformly consistent with the classical taxonomy of [...]. HierCC was also consistent with legacy eBGs/ST Complexes in [...] or [...] and with O serogroups in [...]. Thus, EnteroBase HierCC supports the automated identification of, and assignment to, species/subspecies and populations for multiple genera. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
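To make the idea of clustering allelic profiles at several levels concrete, the following is a minimal sketch and not EnteroBase's implementation. It assumes single-linkage clustering on pairwise allelic distance, treats allele number 0 as a missing call, and uses toy profiles and thresholds (0, 1, 2) chosen only for illustration; all function names are hypothetical.

```python
from itertools import combinations

def allelic_distance(a, b):
    """Count loci whose allele numbers differ; 0 marks a missing call and is skipped."""
    return sum(1 for x, y in zip(a, b) if x and y and x != y)

def single_linkage(profiles, threshold):
    """Map each genome to a cluster label (smallest member name) at one threshold."""
    names = sorted(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the lexicographically smaller root as the cluster label.
                parent[max(ra, rb)] = min(ra, rb)
    return {n: find(n) for n in names}

def hierarchical_clusters(profiles, thresholds):
    """Cluster the same profiles at every threshold, from strict to permissive."""
    return {t: single_linkage(profiles, t) for t in thresholds}

# Toy five-locus profiles (illustrative only; real cgMLST schemes use thousands of loci).
profiles = {
    "g1": (1, 1, 1, 1, 1),
    "g2": (1, 1, 1, 1, 2),  # 1 allele from g1
    "g3": (1, 1, 1, 3, 3),  # 2 alleles from g1 and from g2
    "g4": (4, 4, 4, 4, 4),  # 5 alleles from every other genome
}
levels = hierarchical_clusters(profiles, [0, 1, 2])
for threshold, labels in levels.items():
    print(threshold, labels)
```

Because clusters at a permissive threshold are unions of the clusters found at stricter thresholds, the labels nest: g1 and g2 join at threshold 1, g3 joins them at threshold 2, and g4 stays separate throughout.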