Aris-Brosou Stéphane, Yang Ziheng
Department of Biology, University College London, Darwin Building, Gower Street, London WC1E 6BT, England.
Syst Biol. 2002 Oct;51(5):703-14. doi: 10.1080/10635150290102375.
The molecular clock, i.e., constancy of the rate of evolution over time, is commonly assumed in estimating divergence dates. However, this assumption is often violated and has drastic effects on date estimation. Recently, a number of attempts have been made to relax the clock assumption. One approach is to use maximum likelihood, which assigns rates to branches and allows the estimation of both rates and times. An alternative is the Bayes approach, which models the change of the rate over time. A number of models of rate change have been proposed. We have extended and evaluated models of rate evolution, i.e., the lognormal and its recent variant, along with the gamma, the exponential, and the Ornstein-Uhlenbeck processes. These models were first applied to a small hominoid data set, where an empirical Bayes approach was used to estimate the hyperparameters that measure the amount of rate variation. Estimation of divergence times was sensitive to these hyperparameters, especially when the assumed model is close to the clock assumption. The rate and date estimates varied little from model to model, although the posterior Bayes factor indicated the Ornstein-Uhlenbeck process outperformed the other models. To demonstrate the importance of allowing for rate change across lineages, this general approach was used to analyze a larger data set consisting of the 18S ribosomal RNA gene of 39 metazoan species. We obtained date estimates consistent with paleontological records, the deepest split within the group being about 560 million years ago. Estimates of the rates were in accordance with the Cambrian explosion hypothesis and suggested some more recent lineage-specific bursts of evolution.
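As an illustration of the kinds of rate-change models named in the abstract, the following is a minimal sketch of how a descendant branch rate might be drawn given its ancestral rate. The parameterizations are assumptions made for this example and are not taken from the paper: the lognormal is centered on the ancestral log-rate with variance `nu * t`, its variant is shifted so that the expected descendant rate equals the ancestral rate, the gamma and exponential have the ancestral rate as their mean, and the Ornstein-Uhlenbeck process is applied to the log-rate. The function name `sample_descendant_rate` and the hyperparameters `nu`, `shape`, `theta`, `mu`, and `sigma` are hypothetical.

```python
import math
import random


def sample_descendant_rate(parent_rate, t, model, rng,
                           nu=0.5, shape=2.0, theta=1.0, mu=1.0, sigma=0.5):
    """Draw a descendant rate given the ancestral rate and elapsed time t.

    All parameterizations are illustrative assumptions, not the paper's.
    """
    if model == "lognormal":
        # log-rate is normal around the ancestral log-rate, variance nu * t
        return math.exp(rng.gauss(math.log(parent_rate), math.sqrt(nu * t)))
    if model == "lognormal_mean_corrected":
        # shift the log-mean by -nu * t / 2 so that E[rate] = parent_rate
        mean = math.log(parent_rate) - nu * t / 2.0
        return math.exp(rng.gauss(mean, math.sqrt(nu * t)))
    if model == "gamma":
        # gamma with mean parent_rate (shape * scale = parent_rate)
        return rng.gammavariate(shape, parent_rate / shape)
    if model == "exponential":
        # exponential with mean parent_rate
        return rng.expovariate(1.0 / parent_rate)
    if model == "ou":
        # Ornstein-Uhlenbeck transition on the log-rate, reverting to log(mu)
        decay = math.exp(-theta * t)
        mean = math.log(mu) + (math.log(parent_rate) - math.log(mu)) * decay
        var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * theta)
        return math.exp(rng.gauss(mean, math.sqrt(var)))
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    rng = random.Random(1)
    models = ["lognormal", "lognormal_mean_corrected",
              "gamma", "exponential", "ou"]
    for model in models:
        rate = sample_descendant_rate(1.0, 1.0, model, rng)
        print(f"{model}: {rate:.4f}")
```

In this sketch, small values of `nu` (or of `sigma` in the Ornstein-Uhlenbeck case) keep descendant rates close to the ancestral rate, which corresponds to a model close to the clock assumption.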