Alland David, Whittam Thomas S, Murray Megan B, Cave M Donald, Hazbon Manzour H, Dix Kim, Kokoris Mark, Duesterhoeft Andreas, Eisen Jonathan A, Fraser Claire M, Fleischmann Robert D
Department of Medicine, Center for Emerging Pathogens, New Jersey Medical School, Newark, New Jersey 07103, USA.
J Bacteriol. 2003 Jun;185(11):3392-9. doi: 10.1128/JB.185.11.3392-3399.2003.
The comparative-genomic sequencing of two Mycobacterium tuberculosis strains enabled us to identify single nucleotide polymorphism (SNP) markers for studies of evolution, pathogenesis, and epidemiology in clinical M. tuberculosis. Phylogenetic analysis using these "comparative-genome markers" (CGMs) produced a highly unusual phylogeny with a complete absence of secondary branches. To investigate CGM-based phylogenies, we devised computer models to simulate sequence evolution and calculate new phylogenies based on an SNP format. We found that CGMs represent a distinct class of phylogenetic markers that depend critically on the genetic distances between compared "reference strains." Properly distanced reference strains generate CGMs that accurately depict evolutionary relationships, distorted only by branch collapse. Improperly distanced reference strains generate CGMs that distort and reroot outgroups. Applying this understanding to the CGM-based phylogeny of M. tuberculosis, we found evidence to suggest that this species is highly clonal without detectable lateral gene exchange. We noted indications of evolutionary bottlenecks, including one at the level of the PHRI "C" strain previously associated with particular virulence characteristics. Our evidence also suggests that loss of IS6110 to fewer than seven elements per genome is uncommon. Finally, we present population-based evidence that KasA, an important component of mycolic acid biosynthesis, develops G312S polymorphisms under selective pressure.
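The dependence of CGMs on the choice of reference strains can be illustrated with a toy simulation. The sketch below is not the authors' computer model; it assumes strictly clonal descent with no lateral exchange and an infinite-sites mutation process (every mutation hits a new site), and all function names are hypothetical. Strains are represented as sets of derived mutations, CGMs are the sites at which two chosen reference strains differ, and each strain is then typed only at those sites.

```python
import random
from itertools import count


def simulate_clonal_strains(n_strains, muts_per_branch=3, seed=0):
    """Grow a random clonal tree; return each leaf as a frozenset of mutated sites.

    Assumptions (illustrative only): no recombination, and every mutation
    occurs at a brand-new site (infinite-sites model).
    """
    rng = random.Random(seed)
    new_site = count()
    lineages = [frozenset()]  # single ancestor with no derived mutations
    while len(lineages) < n_strains:
        # Pick a lineage at random and split it into two daughter lineages.
        parent = lineages.pop(rng.randrange(len(lineages)))
        for _ in range(2):
            private = {next(new_site) for _ in range(muts_per_branch)}
            lineages.append(frozenset(parent | private))
    return lineages


def cgm_profiles(strains, ref_a, ref_b):
    """Type every strain at the CGM sites only.

    CGM sites are those where the two reference strains differ. A strain's
    profile is the set of CGM sites at which it differs from ref_a.
    """
    cgm_sites = ref_a ^ ref_b
    return [(s ^ ref_a) & cgm_sites for s in strains]


def is_unbranched(profiles):
    """True if the profiles form a nested chain (a tree with no secondary branches)."""
    ordered = sorted(profiles, key=len)
    return all(p <= q for p, q in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    strains = simulate_clonal_strains(12, seed=1)
    ref_a, ref_b = strains[0], strains[-1]

    cgm = cgm_profiles(strains, ref_a, ref_b)
    full = [s ^ ref_a for s in strains]  # typing at every site, not just CGMs

    print("CGM-only profiles form a single unbranched line:", is_unbranched(cgm))
    print("All-site profiles form a single unbranched line:", is_unbranched(full))
    print("Distinct CGM profiles:", len(set(cgm)), "of", len(strains), "strains")
```

Under these assumptions every CGM lies on the path joining the two reference strains, so each remaining strain can only be placed at the point where it leaves that path. Strains that leave at the same point receive identical profiles (branch collapse), and a strain that diverged before the references' common ancestor is placed at that ancestor rather than outside it.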