Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York, USA.
Population and Ecosystem Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA.
Microb Genom. 2022 Jun;8(6). doi: 10.1099/mgen.0.000839.
The increased accessibility of next-generation sequencing has allowed enough genomes from a given bacterial species to be sequenced to describe the distribution of genes in the pangenome, without limiting analyses to genes present in reference strains. Although some taxa have thousands of whole-genome sequences available in public databases, most genomes were sequenced with short-read technology, resulting in incomplete assemblies. Studying pangenomes could lead to important insights into adaptation, pathogenicity, or molecular epidemiology; however, given the known information loss inherent in analyzing contig-level assemblies, these inferences may be biased or inaccurate. In this study, we describe the pangenome of a clonally evolving pathogen, , and examine the utility of gene content variation in outbreak investigation. We constructed the pangenome using 1463 assembled genomes. We tested the assumption of strict clonal evolution by examining core genes for evidence of recombination and by analyzing the distribution of accessory genes among core monophyletic groups. To determine whether gene content variation could be used in outbreak investigation, we carefully examined accessory genes detected in a well-described outbreak in Minnesota. We found significant errors in accessory gene classification. After accounting for these errors, we show that has a much smaller accessory genome than previously described and provide evidence supporting ongoing clonal evolution and a closed pangenome, with little gene content variation generated over outbreaks. We also identified frameshift mutations in multiple genes, including a mutation in , which has recently been associated with antibiotic tolerance in . A pangenomic approach enables a more comprehensive analysis of genome dynamics than is possible with reference-based approaches; however, without critical evaluation of accessory gene content, inferences of transmission patterns employing these loci could be misguided.
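The distinction between core and accessory genes that the abstract relies on can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name `partition_pangenome`, the `core_threshold` parameter, and the toy genome and gene identifiers are all hypothetical, and the input is assumed to be a simple mapping from each genome to the set of gene families detected in its assembly.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    presence: Dict[str, Set[str]],
    core_threshold: float = 1.0,
) -> Tuple[Set[str], Set[str]]:
    """Split gene families into core and accessory sets.

    presence maps a genome ID to the set of gene-family IDs detected in
    that assembly (hypothetical input format). A family is called core
    when the fraction of genomes carrying it is >= core_threshold.
    """
    if not presence:
        return set(), set()

    n_genomes = len(presence)

    # Count how many genomes carry each gene family.
    counts: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1

    core = {g for g, c in counts.items() if c / n_genomes >= core_threshold}
    accessory = set(counts) - core
    return core, accessory


# Toy data (made up for illustration): gene3 is undetected in genome_C,
# as could happen if the gene is split across a contig break in a
# short-read assembly rather than being truly absent.
toy = {
    "genome_A": {"gene1", "gene2", "gene3"},
    "genome_B": {"gene1", "gene2", "gene3"},
    "genome_C": {"gene1", "gene2"},
}

strict_core, strict_accessory = partition_pangenome(toy)
relaxed_core, relaxed_accessory = partition_pangenome(toy, core_threshold=0.6)
```

Under the strict threshold, a single missed detection is enough to move `gene3` into the accessory set, which is one way assembly artifacts could inflate an apparent accessory genome. Relaxing the threshold hides the problem in this toy case but does not establish whether the absence is real, which is why the accessory calls themselves need to be checked.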