Didelot Xavier, Siveroni Igor, Volz Erik M
School of Life Sciences, University of Warwick, Coventry, United Kingdom.
Department of Statistics, University of Warwick, Coventry, United Kingdom.
Mol Biol Evol. 2021 Jan 4;38(1):307-317. doi: 10.1093/molbev/msaa193.
Phylogenetic dating is one of the most powerful and commonly used methods of drawing epidemiological interpretations from pathogen genomic data. Building such trees requires considering a molecular clock model which represents the rate at which substitutions accumulate on genomes. When the molecular clock rate is constant throughout the tree then the clock is said to be strict, but this is often not an acceptable assumption. Alternatively, relaxed clock models consider variations in the clock rate, often based on a distribution of rates for each branch. However, we show here that the distributions of rates across branches in commonly used relaxed clock models are incompatible with the biological expectation that the sum of the numbers of substitutions on two neighboring branches should be distributed as the substitution number on a single branch of equivalent length. We call this expectation the additivity property. We further show how assumptions of commonly used relaxed clock models can lead to estimates of evolutionary rates and dates with low precision and biased confidence intervals. We therefore propose a new additive relaxed clock model where the additivity property is satisfied. We illustrate the use of our new additive relaxed clock model on a range of simulated and real data sets, and we show that using this new model leads to more accurate estimates of mean evolutionary rates and ancestral dates.
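The additivity property described above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes gamma-distributed rates, ignores the Poisson sampling of actual substitutions (only the expected substitution amount, rate times duration, is compared), and all function names and parameters (`mean_rate`, `rate_sd`, `omega`) are hypothetical choices for illustration. The first function draws an independent rate for each branch, as in a typical uncorrelated relaxed clock. The second uses one possible additive construction, a gamma distribution whose shape is proportional to branch length with a shared scale, relying only on the standard fact that gamma variables with a common scale add in their shapes.

```python
import random
import statistics


def per_branch_rate_variances(length, mean_rate, rate_sd, n=20000, seed=1):
    """Variance of the substitution amount under independent gamma rates per branch.

    Compares one branch of duration `length` against two neighboring
    branches of duration `length / 2`, each with its own rate draw.
    """
    rng = random.Random(seed)
    shape = (mean_rate / rate_sd) ** 2   # gamma shape giving the requested mean and sd
    scale = rate_sd ** 2 / mean_rate     # gamma scale giving the requested mean and sd
    single = [rng.gammavariate(shape, scale) * length for _ in range(n)]
    split = [
        (rng.gammavariate(shape, scale) + rng.gammavariate(shape, scale)) * length / 2
        for _ in range(n)
    ]
    return statistics.variance(single), statistics.variance(split)


def additive_variances(length, mean_rate, omega, n=20000, seed=1):
    """Same comparison under an assumed additive construction.

    The substitution amount on a branch is gamma distributed with
    shape mean_rate * duration / omega and scale omega, so that sums
    over neighboring branches keep the same distribution family.
    """
    rng = random.Random(seed)
    single = [rng.gammavariate(mean_rate * length / omega, omega) for _ in range(n)]
    half_shape = mean_rate * (length / 2) / omega
    split = [
        rng.gammavariate(half_shape, omega) + rng.gammavariate(half_shape, omega)
        for _ in range(n)
    ]
    return statistics.variance(single), statistics.variance(split)


if __name__ == "__main__":
    v_single, v_split = per_branch_rate_variances(length=10, mean_rate=5, rate_sd=2)
    print("per-branch rates, variance ratio single/split:", v_single / v_split)
    a_single, a_split = additive_variances(length=10, mean_rate=5, omega=1)
    print("additive construction, variance ratio single/split:", a_single / a_split)
```

Under the per-branch model the variance ratio is close to 2, so splitting a branch in two changes the distribution of the total, whereas under the additive construction the ratio is close to 1. This illustrates only the property itself, not the estimation procedure used in the paper.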