College of Life Sciences, Beijing Normal University, No 19 Xinjiekouwai Street, Beijing 100875, China.
BMC Evol Biol. 2012 Sep 7;12:174. doi: 10.1186/1471-2148-12-174.
The species Escherichia coli comprises a variety of commensal and pathogenic strains, and its intraspecific diversity is extraordinarily high. As more E. coli strain genomes become available, phylogenomic analyses can provide a more comprehensive picture of their evolutionary history and ecological adaptation. In this study, we constructed two types of whole-genome phylogenies for 34 E. coli strains using collinear genomic segments. The first phylogeny was based on the concatenated collinear regions shared by all of the studied genomes, and the second was based on the variable collinear regions, i.e., those absent from at least one genome. Intuitively, the first phylogeny is likely to reveal the lineal evolutionary history of these strains (an evolutionary phylogeny), whereas the second is likely to reflect the whole-genome similarities of extant strains (a similarity phylogeny).
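The distinction between shared and variable collinear regions can be illustrated with a minimal sketch. It assumes that each region has already been reduced to the set of genomes carrying it; the function name `partition_regions`, the region labels, and the genome labels are hypothetical and do not come from the study, which does not describe its pipeline at this level of detail.

```python
from typing import Dict, List, Set, Tuple


def partition_regions(
    presence: Dict[str, Set[str]], genomes: List[str]
) -> Tuple[List[str], List[str]]:
    """Split regions into those present in every genome and the rest.

    presence maps a (hypothetical) region ID to the set of genomes
    that carry it. Regions carried by no listed genome are ignored.
    """
    all_genomes = set(genomes)
    shared: List[str] = []    # present in all genomes
    variable: List[str] = []  # absent from at least one genome
    for region in sorted(presence):
        carriers = presence[region] & all_genomes
        if carriers == all_genomes:
            shared.append(region)
        elif carriers:
            variable.append(region)
    return shared, variable


# Toy data, invented for illustration only.
toy_presence = {
    "R1": {"A", "B", "C"},
    "R2": {"A", "B"},
    "R3": {"C"},
}
shared, variable = partition_regions(toy_presence, ["A", "B", "C"])
```

Under this assumption, the shared regions would feed the evolutionary phylogeny and the variable regions the similarity phylogeny; how each tree is then inferred is not specified here.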
Within the evolutionary phylogeny, the strains clustered in accordance with known phylogenetic groups and phenotypes. Comparing the evolutionary and similarity phylogenies suggests that Shigella may have originated from at least three distinct ancestors and evolved into a single clade. By scrutinizing the properties that are shared among Shigella strains but missing from other E. coli genomes, we found that the common regions of the Shigella genomes were mainly influenced by mobile genetic elements, implying that these strains may have undergone convergent evolution via horizontal gene transfer. By inspecting certain key branches of interest, we identified several collinear regions that may be associated with the pathogenicity of specific strains. Moreover, examining the annotated genes within these regions revealed further detailed evidence associated with pathogenicity.
Collinear regions are reliable genomic features for phylogenomic analysis of closely related genomes, and they link genomic diversity to phenotypic differences in a meaningful way. The pathogenicity of a strain may be associated with both the acquisition of virulence factors and the modification of its genome via mutations. Phylogenomic studies that compare collinear regions across whole genomes will help to better understand the evolution and adaptation of closely related microbes, and of E. coli in particular.