Instituto de Microbiologia, Instituto de Microbiologia Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal.
J Clin Microbiol. 2022 Jun 15;60(6):e0031522. doi: 10.1128/jcm.00315-22. Epub 2022 May 9.
Streptococcus pyogenes is a major human pathogen with high genetic diversity, largely created by recombination and horizontal gene transfer, making it difficult to use single nucleotide polymorphism (SNP)-based genome-wide analyses for surveillance. Using a gene-by-gene approach on 208 complete genomes of S. pyogenes, a novel whole-genome multilocus sequence typing (wgMLST) schema was developed, comprising 3,044 target loci. The schema was used for core-genome MLST (cgMLST) analyses of previously published data sets and 265 newly sequenced draft genomes with other molecular and phenotypic typing data. Clustering based on cgMLST data supported the genetic heterogeneity of many emm types and correlated poorly with pulsed-field gel electrophoresis macrorestriction profiling, superantigen gene profiling, and MLST sequence type, highlighting the limitations of older typing methods. While 763 loci were present in all isolates of a data set representative of S. pyogenes genetic diversity, the proposed schema allows scalable cgMLST analysis, which can include more loci for an increased resolution when typing closely related isolates. The cgMLST and PopPUNK clusters were broadly consistent in this diverse population. The cgMLST analyses presented results comparable to those of SNP-based methods in the identification of two recently emerged sublineages of emm1 and emm89 and the clarification of the genetic relatedness among isolates recovered in outbreak contexts. The schema was thoroughly annotated and made publicly available on the chewie-NS online platform (https://chewbbaca.online/species/1/schemas/1), providing a framework for high-resolution typing and analyzing the genetic variability of loci of particular biological interest.
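The idea of scalable cgMLST described in the abstract (a strict core of loci present in all isolates, relaxed to include more loci when typing closely related isolates) can be illustrated with a minimal Python sketch. This is not the authors' pipeline and does not use the chewBBACA or chewie-NS APIs. The isolate names, locus names, allele identifiers, and the `min_presence` parameter are all hypothetical, and the distance is a simple count of differing alleles over loci called in both isolates.

```python
from itertools import combinations

# Hypothetical allelic profiles: isolate -> {locus: allele ID, or None if not called}.
profiles = {
    "isolate_A": {"locus1": 1, "locus2": 3, "locus3": 2, "locus4": None},
    "isolate_B": {"locus1": 1, "locus2": 3, "locus3": 5, "locus4": 7},
    "isolate_C": {"locus1": 2, "locus2": 4, "locus3": 5, "locus4": 7},
}


def core_loci(profiles, min_presence=1.0):
    """Return the sorted loci called in at least `min_presence` of the isolates.

    min_presence=1.0 gives a strict core (present in every isolate);
    lower values admit more loci, which is the "scalable" part.
    """
    n_isolates = len(profiles)
    all_loci = set().union(*(p.keys() for p in profiles.values()))
    kept = []
    for locus in sorted(all_loci):
        called = sum(1 for p in profiles.values() if p.get(locus) is not None)
        if called / n_isolates >= min_presence:
            kept.append(locus)
    return kept


def allelic_distance(a, b, loci):
    """Count loci where both isolates have a called allele and the alleles differ."""
    return sum(
        1
        for locus in loci
        if a.get(locus) is not None
        and b.get(locus) is not None
        and a[locus] != b[locus]
    )


def pairwise_distances(profiles, loci):
    """Allelic distance for every unordered pair of isolates."""
    return {
        (x, y): allelic_distance(profiles[x], profiles[y], loci)
        for x, y in combinations(sorted(profiles), 2)
    }


strict_core = core_loci(profiles)                      # loci called in all isolates
relaxed_core = core_loci(profiles, min_presence=0.6)   # also admits locus4
distances = pairwise_distances(profiles, strict_core)
```

In this toy data set the strict core excludes `locus4` because it was not called in one isolate; lowering `min_presence` brings it back in. Ignoring loci that are missing in either isolate is only one of several possible conventions for handling missing data.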