Department of Science and Technology, Medical Research Council Centre for Molecular and Cellular Biology, Stellenbosch University, Tygerberg, Cape Town, South Africa.
PLoS One. 2012;7(4):e30593. doi: 10.1371/journal.pone.0030593. Epub 2012 Apr 4.
Mycobacterium tuberculosis complex (MTBC) genomes contain two large gene families termed pe and ppe. The function of pe/ppe proteins remains enigmatic, but studies suggest that they are secreted or cell surface associated and are involved in bacterial virulence. Previous studies have also shown that some pe/ppe genes are polymorphic, a finding that suggests involvement in antigenic variation.
Using comparative sequence analysis of 18 publicly available MTBC whole genome sequences, we aligned 33 pe (excluding pe_pgrs) and 66 ppe genes in order to determine the frequency and nature of genetic variation. This work was supplemented by whole gene sequencing of 14 pe/ppe genes (including 5 pe_pgrs genes) in a cohort of 40 diverse and well-defined clinical isolates covering all the main lineages of the M. tuberculosis phylogenetic tree.
We show that nonsynonymous SNPs (nsSNPs) are 3.0 and 3.3 times more frequent in pe (excluding pe_pgrs) and ppe genes, respectively, than in non-pe/ppe genes, and that numerous other mutation types are also present at high frequency. It has previously been shown that non-pe/ppe M. tuberculosis genes display a remarkably low level of purifying selection. Here, we also show that, compared to these genes, those of the pe/ppe families show a further reduction of selection pressure that suggests neutral evolution. This is inconsistent with the positive selection pressure of "classical" antigenic variation.
Finally, by analyzing such a large number of genes, we were able to detect large differences in mutation type and frequency both between individual genes and between gene sub-families. The high variation rates and absence of selective constraints provide valuable insights into potential pe/ppe function. Since pe/ppe proteins are highly antigenic and have been studied as potential vaccine components, these results should also prove informative for aspects of M. tuberculosis vaccine design.
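The nsSNP comparison reported in the abstract rests on classifying each single-nucleotide difference in an aligned coding sequence as synonymous or nonsynonymous. The following is a minimal Python sketch of that classification step only, not the authors' pipeline. The function name `classify_snps`, the requirement that both sequences be in-frame, gap-free and of equal length, and the toy sequences are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_snps(ref, alt):
    """Return (synonymous, nonsynonymous) SNP counts for alt versus ref.

    Assumption: ref and alt are in-frame, gap-free coding sequences of
    equal length. Each differing base is scored on its own against the
    reference codon, which simplifies codons with more than one change.
    """
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("sequences must be equal length and in frame")

    syn = nonsyn = 0
    for pos, (r, a) in enumerate(zip(ref, alt)):
        if r == a:
            continue
        start = pos - pos % 3              # first base of the codon
        offset = pos % 3                   # position inside the codon
        codon = ref[start:start + 3]
        mutant = codon[:offset] + a + codon[offset + 1:]
        # Skip codons with ambiguous bases (for example N).
        if codon not in CODON_TABLE or mutant not in CODON_TABLE:
            continue
        if CODON_TABLE[codon] == CODON_TABLE[mutant]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy example (hypothetical sequences): GCT->GCC keeps Ala, GAA->GAT is Glu->Asp.
print(classify_snps("ATGGCTGAA", "ATGGCCGAT"))  # (1, 1)
```

Counts like these would still need to be normalised, for example per gene length or per available site, before gene families can be compared. The abstract does not specify which normalisation was used.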