Department of Biology, University of York, York YO10 5DD, UK.
Department of Molecular Biology and Genetics, Aarhus University, 8000 Aarhus, Denmark.
Genes (Basel). 2021 Jan 18;12(1):111. doi: 10.3390/genes12010111.
Bacteria currently included in Rhizobium leguminosarum are too diverse to be considered a single species, so we refer to them as a species complex (the Rlc). We found 429 publicly available genome sequences that fall within the Rlc, and these show that the Rlc is a distinct entity, well separated from other species in the genus. Its sister taxon is R. anhuiense. We constructed a phylogeny based on concatenated sequences of 120 universal (core) genes, and calculated pairwise average nucleotide identity (ANI) between all genomes. From these analyses, we concluded that the Rlc includes 18 distinct genospecies, plus 7 unique strains that are not placed in these genospecies. Each genospecies is separated from the others by a distinct gap in ANI values, usually at approximately 96% ANI, implying that it is a 'natural' unit. Five of the genospecies include the type strains of named species: R. laguerreae, R. sophorae, R. ruizarguesonis, "R. indicum" and R. leguminosarum itself. The 16S ribosomal RNA sequence is remarkably diverse within the Rlc, but does not distinguish the genospecies. Partial sequences of housekeeping genes, which have frequently been used to characterize isolate collections, can mostly be assigned unambiguously to a genospecies, but alleles within a genospecies do not always form a clade, so single genes are not a reliable guide to the true phylogeny of the strains. We conclude that access to a large number of genome sequences is a powerful tool for characterizing the diversity of bacteria, and that taxonomic conclusions should be based on all available genome sequences, not just those of type strains.
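The idea of delimiting genospecies by a gap in pairwise ANI can be illustrated with a minimal sketch. It assumes a simple single-linkage rule (two genomes fall in the same group if a chain of pairs at or above the threshold connects them); the abstract does not state the clustering procedure the authors used, so this rule is an assumption. The function name `cluster_by_ani`, the strain labels, and the ANI values are all hypothetical; only the approximate 96% threshold comes from the text.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold=96.0):
    """Single-linkage grouping of genomes from pairwise ANI values (percent).

    `ani` maps an unordered pair (a, b) to its ANI; missing pairs are
    treated as below the threshold. This is an illustrative rule, not
    necessarily the procedure used in the paper.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    # Largest groups first, ties broken alphabetically, for stable output.
    return sorted((sorted(m) for m in groups.values()), key=lambda m: (-len(m), m))


# Hypothetical toy data: s1-s3 share high ANI, as do s4-s5, with a gap between.
genomes = ["s1", "s2", "s3", "s4", "s5"]
ani = {
    ("s1", "s2"): 98.1, ("s1", "s3"): 97.4, ("s2", "s3"): 97.9,
    ("s1", "s4"): 93.2, ("s2", "s4"): 93.0, ("s3", "s4"): 92.8,
    ("s1", "s5"): 93.1, ("s2", "s5"): 92.9, ("s3", "s5"): 93.3,
    ("s4", "s5"): 96.7,
}
clusters = cluster_by_ani(genomes, ani)
print(clusters)
```

With a clear gap between within-group and between-group ANI, as in the toy values here, the grouping is insensitive to the exact threshold chosen inside the gap, which is the sense in which such groups behave as 'natural' units.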