Department of Pathology, The Methodist Hospital, Center for Molecular and Translational Human Infectious Diseases Research, The Methodist Hospital Research Institute, Houston, TX 77030, USA.
Proc Natl Acad Sci U S A. 2010 Mar 2;107(9):4371-6. doi: 10.1073/pnas.0911295107. Epub 2010 Feb 8.
Understanding the fine-structure molecular architecture of bacterial epidemics has been a long-sought goal of infectious disease research. We used short-read-length DNA sequencing coupled with mass spectrometry analysis of SNPs to study the molecular pathogenomics of three successive epidemics of invasive infections involving 344 serotype M3 group A Streptococcus strains in Ontario, Canada. Sequencing the genomes of 95 strains from the three epidemics, coupled with analysis of 280 biallelic SNPs in all 344 strains, revealed an unexpectedly complex population structure composed of a dynamic mixture of distinct clonally related complexes. We discovered that each epidemic is dominated by micro- and macrobursts of multiple emergent clones, some with distinct strain genotype-patient phenotype relationships. On average, strains were differentiated from one another by only 49 SNPs and 11 insertion-deletion events (indels) in the core genome. Ten percent of SNPs are strain specific; that is, each strain has a unique genome sequence. We identified nonrandom temporal-spatial patterns of strain distribution within and between the epidemic peaks. The extensive full-genome data permitted us to identify genes with significantly increased rates of nonsynonymous (amino acid-altering) nucleotide polymorphisms, thereby providing clues about selective forces operative in the host. Comparative expression microarray analysis revealed that closely related strains differentiated by seemingly modest genetic changes can have significantly divergent transcriptomes. We conclude that enhanced understanding of bacterial epidemics requires a deep-sequencing, geographically centric, comparative pathogenomics strategy.
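The two summary statistics quoted in the abstract (mean pairwise SNP difference between strains, and the share of SNPs that are strain specific) can be illustrated with a minimal sketch. This is not the authors' pipeline: the strain names, the toy genotype strings, and the function names `pairwise_snp_distances`, `mean_distance`, and `strain_specific_loci` are all hypothetical, and a locus is treated here as "strain specific" when exactly one strain carries an allele that no other strain shares.

```python
from collections import Counter
from itertools import combinations


def pairwise_snp_distances(genotypes):
    """Count differing biallelic SNP loci for every pair of strains.

    genotypes: dict mapping strain name -> string of '0'/'1' alleles,
    one character per SNP locus (all strings the same length).
    """
    distances = {}
    for a, b in combinations(sorted(genotypes), 2):
        distances[(a, b)] = sum(x != y for x, y in zip(genotypes[a], genotypes[b]))
    return distances


def mean_distance(distances):
    """Average number of SNP differences over all strain pairs."""
    return sum(distances.values()) / len(distances)


def strain_specific_loci(genotypes):
    """Map each strain to the loci where it alone carries an allele.

    Assumes at least three strains, so that "alone" is unambiguous.
    """
    strains = sorted(genotypes)
    result = {s: [] for s in strains}
    n_loci = len(genotypes[strains[0]])
    for locus in range(n_loci):
        counts = Counter(genotypes[s][locus] for s in strains)
        for s in strains:
            if counts[genotypes[s][locus]] == 1:
                result[s].append(locus)
    return result


# Hypothetical toy data: four strains typed at five biallelic SNP loci.
toy = {
    "strainA": "00100",
    "strainB": "00010",
    "strainC": "10011",
    "strainD": "01011",
}

dists = pairwise_snp_distances(toy)
specific = strain_specific_loci(toy)
print(dists)
print(round(mean_distance(dists), 2))
print(specific)
```

With real data, the same functions would take one genotype string per strain across the 280 typed SNPs; the toy strings above exist only to show the bookkeeping.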