Université PARIS-EST, Anses, Laboratory for food safety, Maisons-Alfort, France.
BMC Microbiol. 2017 Nov 28;17(1):222. doi: 10.1186/s12866-017-1132-1.
Many bacterial genomic studies exploring the evolutionary processes of host adaptation focus on the accessory genome, describing how gene gains and losses can explain the colonization of new habitats. To complement this view, we developed a new approach focusing on the coregenome in order to describe the host adaptation of Salmonella serovars.
In the present work, we propose bioinformatic tools allowing (i) robust phylogenetic inference based on SNPs and recombination events, (ii) identification of fixed SNPs and InDels distinguishing homoplastic and non-homoplastic coregenome variants, and (iii) gene-ontology enrichment analyses to describe metabolic processes involved in adaptation of Salmonella enterica subsp. enterica to mammalian- (S. Dublin), multi- (S. Enteritidis), and avian- (S. Pullorum and S. Gallinarum) hosts.
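As an illustration of step (ii), the following minimal Python sketch shows one simple way to call a variant "fixed" in a clade: every genome in the clade shares an allele that no genome outside the clade carries. This is an assumption made for illustration, not the implementation of 'phyloFixedVar' or 'FixedVar'. The function name `fixed_variants`, the toy genome names and the toy sequences are hypothetical, and the sketch does not attempt the homoplasy test, which would additionally require the inferred tree.

```python
from typing import Dict, List, Set


def fixed_variants(alleles: Dict[str, str], clade: Set[str]) -> List[int]:
    """Return 0-based alignment positions where all genomes in `clade`
    share a single allele that is absent from every genome outside it.

    `alleles` maps a genome name to its aligned coregenome sequence
    (all sequences are assumed to have the same length).
    """
    inside = [seq for name, seq in alleles.items() if name in clade]
    outside = [seq for name, seq in alleles.items() if name not in clade]
    if not inside or not outside:
        return []  # nothing to compare against

    fixed = []
    for pos in range(len(inside[0])):
        in_alleles = {seq[pos] for seq in inside}
        out_alleles = {seq[pos] for seq in outside}
        # Fixed: one allele inside the clade, never seen outside it.
        if len(in_alleles) == 1 and in_alleles.isdisjoint(out_alleles):
            fixed.append(pos)
    return fixed


# Hypothetical toy alignment (not real Salmonella data).
toy = {
    "dublin_1": "ACGTA",
    "dublin_2": "ACGTA",
    "enteritidis_1": "ATGTC",
    "enteritidis_2": "ATGCC",
}
print(fixed_variants(toy, {"dublin_1", "dublin_2"}))  # [1, 4]
```

Position 3 is not reported because the allele shared by the toy Dublin genomes also occurs in one genome outside the clade.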
The 'VARCall' workflow produced a robust phylogenetic inference confirming that the monophyletic clade S. Dublin diverged from the polyphyletic clade S. Enteritidis, which includes the divergent clades S. Pullorum and S. Gallinarum (i). The scripts 'phyloFixedVar' and 'FixedVar' detected non-synonymous and non-homoplastic fixed variants supporting the phylogenetic reconstruction (ii). The scripts 'GetGOxML' and 'EveryGO' identified representative metabolic pathways related to host adaptation, in what is the first gene-ontology enrichment analysis based on bacterial coregenome variants (iii).
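The enrichment step (iii) can be pictured with a standard over-representation test. The sketch below computes an upper-tail hypergeometric p-value for a single gene-ontology term. It is a generic illustration under stated assumptions, not the code of 'GetGOxML' or 'EveryGO', and the abstract does not state which statistical test those scripts use. The function name `go_enrichment_p` and the example counts are hypothetical.

```python
from math import comb


def go_enrichment_p(n_genes: int, n_term: int, n_hits: int, n_hits_term: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= n_hits_term).

    n_genes      -- coregenome genes with GO annotation (population size)
    n_term       -- genes in the population annotated with the GO term
    n_hits       -- genes carrying fixed variants (sample size)
    n_hits_term  -- genes carrying fixed variants AND annotated with the term
    """
    upper = min(n_term, n_hits)
    total = comb(n_genes, n_hits)
    # Sum the probabilities of observing n_hits_term or more annotated genes.
    tail = sum(
        comb(n_term, i) * comb(n_genes - n_term, n_hits - i)
        for i in range(n_hits_term, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 genes, 4 annotated with the term,
# 3 genes carry fixed variants, 2 of which are annotated.
p = go_enrichment_p(10, 4, 3, 2)
print(round(p, 4))  # 0.3333
```

In a real analysis one such p-value would be computed per GO term and then corrected for multiple testing, a step this sketch omits.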
In the present manuscript, we propose a new coregenome approach that couples the identification of fixed SNPs and InDels with respect to inferred phylogenetic clades with gene-ontology enrichment analysis, in order to describe the adaptation of Salmonella serovars Dublin (i.e. mammalian-hosts), Enteritidis (i.e. multi-hosts), Pullorum (i.e. avian-hosts) and Gallinarum (i.e. avian-hosts) at the coregenome scale. All these polyvalent bioinformatic tools can be applied to other bacterial genera without additional development.