Lepage Thomas, Bryant David, Philippe Hervé, Lartillot Nicolas
Department of Mathematics and Statistics, McGill University, Montréal, Québec, Canada.
Mol Biol Evol. 2007 Dec;24(12):2669-80. doi: 10.1093/molbev/msm193. Epub 2007 Sep 21.
Several models have been proposed to relax the molecular clock in order to estimate divergence times. However, it is unclear which model has the best fit to real data and should therefore be used to perform molecular dating. In particular, we do not know whether rate autocorrelation should be considered or which prior on divergence times should be used. In this work, we propose a general benchmark of alternative relaxed clock models. We have reimplemented most existing models, including the popular lognormal model, as well as various prior choices for divergence times (birth-death, Dirichlet, uniform), in a common Bayesian statistical framework. We also propose a new autocorrelated model, called the "CIR" process, with well-defined stationary properties. We assess the relative fit of these models and priors, when applied to 3 different protein data sets from eukaryotes, vertebrates, and mammals, by computing Bayes factors using a numerical method called thermodynamic integration. We find that the 2 autocorrelated models, CIR and lognormal, have a similar fit and clearly outperform uncorrelated models on all 3 data sets. In contrast, the optimal choice for the divergence time prior is more dependent on the data investigated. Altogether, our results provide useful guidelines for model choice in the field of molecular dating while opening the way to more extensive model comparisons.
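As a rough illustration of the thermodynamic integration mentioned in the abstract, the sketch below estimates a log marginal likelihood by averaging the log-likelihood under "power posteriors" proportional to likelihood^beta times prior, then integrating that average over beta from 0 to 1. Everything in it is an assumption made for illustration, not the authors' implementation: the model is a toy Gaussian mean with a Gaussian prior rather than a relaxed clock model, the names log_marginal_ti and log_marginal_exact are hypothetical, and the power posterior is sampled exactly where a phylogenetic analysis would need MCMC. The toy model is chosen because its marginal likelihood has a closed form to check against.

```python
import math
import random


def log_marginal_exact(y, sigma, tau):
    """Closed-form log marginal likelihood of the toy model (used only as a check).

    Toy model (an assumption for illustration): y_i ~ N(theta, sigma^2),
    with prior theta ~ N(0, tau^2).
    """
    n, s, ss = len(y), sum(y), sum(v * v for v in y)
    s2, t2 = sigma ** 2, tau ** 2
    log_det = (n - 1) * math.log(s2) + math.log(s2 + n * t2)
    quad = (ss - t2 * s * s / (s2 + n * t2)) / s2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


def log_marginal_ti(y, sigma, tau, steps=50, draws=2000, seed=0):
    """Estimate the log marginal likelihood by thermodynamic integration."""
    rng = random.Random(seed)
    n, s, ss = len(y), sum(y), sum(v * v for v in y)
    s2 = sigma ** 2
    log_norm = -0.5 * n * math.log(2 * math.pi * s2)
    curve = []
    for i in range(steps + 1):
        # Cubic spacing puts more grid points near beta = 0,
        # where the expected log-likelihood changes fastest.
        beta = (i / steps) ** 3
        # For this toy model the power posterior is Gaussian, so it can be
        # sampled directly; a real analysis would run MCMC at each beta.
        prec = 1.0 / tau ** 2 + beta * n / s2
        mean = (beta * s / s2) / prec
        sd = math.sqrt(1.0 / prec)
        total = 0.0
        for _ in range(draws):
            theta = rng.gauss(mean, sd)
            total += log_norm - (ss - 2 * theta * s + n * theta * theta) / (2 * s2)
        curve.append((beta, total / draws))
    # Trapezoidal rule over the (beta, mean log-likelihood) curve.
    return sum(0.5 * (b1 - b0) * (f0 + f1)
               for (b0, f0), (b1, f1) in zip(curve, curve[1:]))


if __name__ == "__main__":
    data = [0.3, -0.5, 1.2, 0.8, -0.1, 0.4, 1.5, 0.0, 0.7, 0.2]
    # Two competing "models" that differ only in prior width (hypothetical).
    log_z_narrow = log_marginal_ti(data, sigma=1.0, tau=1.0)
    log_z_wide = log_marginal_ti(data, sigma=1.0, tau=2.0)
    print("log Bayes factor (narrow vs wide):", log_z_narrow - log_z_wide)
    print("exact value:", log_marginal_exact(data, 1.0, 1.0)
          - log_marginal_exact(data, 1.0, 2.0))
```

The log Bayes factor between two models is the difference of their log marginal likelihoods, which is how the two estimates are combined in the usage lines. The estimate carries both Monte Carlo error (from the finite number of draws) and discretization error (from the beta grid), so it should be close to, but not identical to, the closed-form value.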