1 Marie Bashir Institute of Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW 2006, Australia.
2 Centre for Systems Genomics, The University of Melbourne, Melbourne, VIC 3010, Australia.
Microb Genom. 2016 Nov 30;2(11):e000094. doi: 10.1099/mgen.0.000094. eCollection 2016 Nov.
Estimating the rates at which bacterial genomes evolve is critical to understanding major evolutionary and ecological processes such as disease emergence, long-term host-pathogen associations and short-term transmission patterns. The surge in bacterial genomic data sets provides a new opportunity to estimate these rates and reveal the factors that shape bacterial evolutionary dynamics. For many organisms, estimates of evolutionary rate display an inverse association with the time-scale over which the data are sampled. However, this relationship remains unexplored in bacteria because genome-wide evolutionary rates are difficult to estimate: the estimates are affected by the extent of temporal structure in the data and by the prevalence of recombination. We collected 36 whole genome sequence data sets from 16 species of bacterial pathogens to systematically estimate and compare their evolutionary rates and to assess the extent of temporal structure in the absence of recombination. The majority (28/36) of data sets possessed sufficient clock-like structure to robustly estimate evolutionary rates. However, in some species reliable estimates were not possible even with 'ancient DNA' data sampled over many centuries, suggesting that these species evolve very slowly or that they display extensive rate variation among lineages. The robustly estimated evolutionary rates spanned several orders of magnitude, from approximately 10⁻⁵ to 10⁻⁸ nucleotide substitutions per site per year. This variation was negatively associated with sampling time, and the relationship was best described by an exponential decay curve. To avoid potential estimation biases, such time-dependency should be considered when inferring evolutionary time-scales in bacteria.
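As a rough illustration of the kind of relationship described above, the following minimal sketch fits an exponential decay curve to rate estimates as a function of sampling time. It is not the authors' analysis: the functional form rate = a * exp(-b * t), the function name `fit_exponential_decay`, and all numerical values are assumptions chosen for illustration, and the data points are synthetic rather than taken from the study.

```python
import numpy as np


def fit_exponential_decay(sampling_years, rates):
    """Fit rate = a * exp(-b * t) by least squares on log(rate).

    Assumed model for illustration only; returns the pair (a, b).
    """
    t = np.asarray(sampling_years, dtype=float)
    r = np.asarray(rates, dtype=float)
    # Taking logs turns the model into a straight line:
    # log(rate) = log(a) - b * t
    slope, intercept = np.polyfit(t, np.log(r), 1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic, noise-free data (hypothetical values, not from the paper).
true_a, true_b = 1e-5, 0.01
sampling_years = np.array([5.0, 20.0, 50.0, 100.0, 300.0, 600.0])
rates = true_a * np.exp(-true_b * sampling_years)

a_hat, b_hat = fit_exponential_decay(sampling_years, rates)
print(f"a = {a_hat:.3e} substitutions/site/year, b = {b_hat:.4f} per year")
```

Fitting on the log scale keeps the sketch dependency-light and gives the small rates at long sampling times comparable weight; with real, noisy estimates a nonlinear fit that accounts for the uncertainty of each rate would likely be more appropriate.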