National Food Institute, Building 204, The Technical University of Denmark, 2800 Kgs Lyngby, Denmark
BMC Genomics. 2012 Mar 12;13:88. doi: 10.1186/1471-2164-13-88.
Technological advances in high-throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identifying relevant genes and variation are needed to enable comparison between studies and over time. The core genes, those conserved in all (or most) members of a genus or species, are potentially good candidates for investigating genomic variation in phylogeny and epidemiology.
We identify a set of 2,882 core gene clusters based on 73 publicly available Salmonella enterica genomes and evaluate their value as typing targets, comparing whole genome typing with traditional methods such as 16S and MLST. A consensus tree based on variation in the core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher confidence. The core genes can be divided into two categories: a few highly variable genes and a larger set of conserved core genes with low variance. For the most variable core genes, the variance in amino acid sequences is higher than in the corresponding nucleotide sequences, suggesting positive selection for mutations that lead to amino acid changes.
Genomic variation within the core genome is useful for investigating molecular evolution and for providing candidate genes for bacterial genome typing. Identifying genes with different degrees of variation is important, especially in trend analysis.
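The definition of core genes used above (clusters conserved in all, or most, genomes) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the abstract does not describe the clustering method, so the function name `core_gene_clusters`, the `min_fraction` threshold, and the toy genome and cluster labels are all assumptions made for illustration.

```python
from collections import defaultdict


def core_gene_clusters(cluster_membership, n_genomes, min_fraction=1.0):
    """Return the IDs of clusters found in at least min_fraction of the genomes.

    cluster_membership: iterable of (genome_id, cluster_id) pairs, assumed to
    come from some upstream gene-clustering step (hypothetical input format).
    min_fraction: 1.0 means "present in all genomes"; lower values relax the
    definition to "present in most genomes".
    """
    genomes_per_cluster = defaultdict(set)
    for genome_id, cluster_id in cluster_membership:
        # A set counts each genome once, even if it carries several copies.
        genomes_per_cluster[cluster_id].add(genome_id)

    threshold = min_fraction * n_genomes
    return sorted(
        cluster_id
        for cluster_id, genomes in genomes_per_cluster.items()
        if len(genomes) >= threshold
    )


# Toy data: three hypothetical genomes and three hypothetical gene clusters.
membership = [
    ("g1", "A"), ("g2", "A"), ("g3", "A"),  # cluster A is in every genome
    ("g1", "B"), ("g2", "B"),               # cluster B is in two of three
    ("g3", "C"),                            # cluster C is in one genome only
]

strict_core = core_gene_clusters(membership, n_genomes=3)
relaxed_core = core_gene_clusters(membership, n_genomes=3, min_fraction=0.6)
```

With the strict setting only cluster A qualifies; relaxing the threshold to 60% of genomes also admits cluster B, which shows how the "all (or most)" wording changes the size of the core set.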