NI Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia.
J Bacteriol. 2013 Jun;195(12):2786-92. doi: 10.1128/JB.02285-12. Epub 2013 Apr 12.
Multiple sequencing of genomes belonging to a bacterial species allows one to analyze and compare statistics and dynamics of the gene complements of species, their pan-genomes. Here, we analyzed multiple genomes of Escherichia coli, Shigella spp., and Salmonella enterica. We demonstrate that the distribution of the number of genomes harboring a gene is well approximated by a sum of two power functions, describing frequent genes (present in many strains) and rare genes (present in few strains). The virtual absence of Shigella-specific genes not present in E. coli genomes confirms previous observations that Shigella is not an independent genus. While the pan-genome size is increasing with each new strain, the number of genes present in a fixed fraction of strains stabilizes quickly. For instance, slightly fewer than 4,000 genes are present in at least half of any group of E. coli genomes. Comparison of S. enterica and E. coli pan-genomes revealed the existence of a common periphery, that is, genes present in some but not all strains of both species. Analysis of phylogenetic trees demonstrates that rare genes from the periphery likely evolve under horizontal transfer, whereas frequent periphery genes may have been inherited from the periphery genome of the common ancestor.
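The quantities named in the abstract (the distribution of the number of genomes harboring a gene, the pan-genome size, and the number of genes present in a fixed fraction of strains) can be illustrated with a minimal sketch. It assumes a gene presence/absence table is already available. The gene names, strain names, and function names below are hypothetical toy values, not data or code from the study, and no power-function fit is attempted.

```python
from collections import Counter

# Hypothetical toy data (not from the study): gene family -> strains carrying it.
presence = {
    "geneA": {"s1", "s2", "s3", "s4"},
    "geneB": {"s1", "s2", "s3", "s4"},
    "geneC": {"s1", "s2", "s3"},
    "geneD": {"s1", "s2"},
    "geneE": {"s4"},
    "geneF": {"s3"},
}


def frequency_spectrum(presence, strains):
    """Map k -> number of genes found in exactly k of the chosen strains."""
    chosen = set(strains)
    counts = Counter(len(carriers & chosen) for carriers in presence.values())
    counts.pop(0, None)  # genes absent from every chosen strain are not counted
    return dict(counts)


def pan_genome_size(presence, strains):
    """Number of genes present in at least one of the chosen strains."""
    return sum(frequency_spectrum(presence, strains).values())


def genes_in_fraction(presence, strains, fraction):
    """Genes present in at least `fraction` of the chosen strains."""
    chosen = set(strains)
    threshold = fraction * len(chosen)
    return sorted(
        gene
        for gene, carriers in presence.items()
        if carriers & chosen and len(carriers & chosen) >= threshold
    )


all_strains = ["s1", "s2", "s3", "s4"]
spectrum = frequency_spectrum(presence, all_strains)
pan_size = pan_genome_size(presence, all_strains)
half = genes_in_fraction(presence, all_strains, 0.5)
core = genes_in_fraction(presence, all_strains, 1.0)
```

In this toy table, rare genes fall at the low end of the spectrum (small k) and frequent genes at the high end. Recomputing `pan_genome_size` and `genes_in_fraction` on growing subsets of strains is one way to see how the two quantities behave as strains are added.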