Department of Bioengineering, University of California San Diego, La Jolla, CA 92093.
Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX 77030.
Proc Natl Acad Sci U S A. 2022 May 3;119(18):e2119396119. doi: 10.1073/pnas.2119396119. Epub 2022 Apr 27.
Combatting Clostridioides difficile infections, a dominant cause of hospital-associated infections with incidence and resulting deaths increasing worldwide, is complicated by the frequent emergence of new virulent strains. Here, we employ whole-genome sequencing, high-throughput phenotypic screenings, and genome-scale models of metabolism to evaluate the genetic diversity of 451 strains of C. difficile. Constructing the C. difficile pangenome based on this set revealed 9,924 distinct gene clusters, of which 2,899 (29%) are defined as core, 2,968 (30%) are defined as unique, and the remaining 4,057 (41%) are defined as accessory. We develop a strain typing method, sequence typing by accessory genome (STAG), that identifies 176 genetically distinct groups of strains and allows for explicit interrogation of accessory gene content. Thirty-five strains representative of the overall set were experimentally profiled on 95 different nutrient sources, revealing 26 distinct growth profiles and unique nutrient preferences; 451 strain-specific genome-scale models of metabolism were constructed, allowing us to computationally probe phenotypic diversity in 28,864 unique conditions. The models create a mechanistic link between the observed phenotypes and strain-specific genetic differences and exhibit an ability to correctly predict growth in 76% of measured cases. The typing and model predictions are used to identify and contextualize discriminating genetic features and phenotypes that may contribute to the emergence of new problematic strains.
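The core/unique/accessory split described above can be illustrated with a minimal sketch. It assumes a simple rule that the abstract does not state: a gene cluster is "core" if present in every strain, "unique" if present in exactly one strain, and "accessory" otherwise. The function names, the `core_fraction` parameter, and the toy presence data are hypothetical and are not taken from the study, which may use different cutoffs.

```python
from collections import Counter


def classify_pangenome(presence, core_fraction=1.0):
    """Label each gene cluster as core, unique, or accessory.

    presence: dict mapping strain name -> set of gene cluster ids.
    core_fraction: assumed cutoff (hypothetical); a cluster found in at
    least this fraction of strains is called core.
    """
    n_strains = len(presence)
    # Count how many strains carry each gene cluster.
    counts = Counter(g for genes in presence.values() for g in genes)
    categories = {}
    for cluster, n in counts.items():
        if n >= core_fraction * n_strains:
            categories[cluster] = "core"
        elif n == 1:
            categories[cluster] = "unique"
        else:
            categories[cluster] = "accessory"
    return categories


def summarize(categories):
    """Return {category: (count, rounded percent of all clusters)}."""
    total = len(categories)
    tally = Counter(categories.values())
    return {
        name: (tally[name], round(100 * tally[name] / total))
        for name in ("core", "unique", "accessory")
    }


# Toy data (invented for illustration): three strains, five gene clusters.
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5"},
}
categories = classify_pangenome(toy)
summary = summarize(categories)
print(summary)
```

The same arithmetic reproduces the reported percentages: the three counts 2,899, 2,968, and 4,057 sum to 9,924 and round to 29%, 30%, and 41% of the total.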