Department of Integrative Biology, University of California, Berkeley, CA 94720, USA.
Syst Biol. 2012 Oct;61(5):793-809. doi: 10.1093/sysbio/sys032. Epub 2012 Feb 14.
In Bayesian divergence time estimation, calibration information from the fossil record is commonly incorporated by assigning prior densities to ancestral nodes in the tree. Calibration prior densities are typically parametric distributions offset by minimum age estimates provided by the fossil record. Specifying the parameters of a calibration density requires the user to quantify prior knowledge of the age of the ancestral node relative to the age of its calibrating fossil. The values of these parameters can bias estimates of node ages if they lead to overly informative prior distributions, and determining parameter values that lead to adequate prior densities is not straightforward. In this study, I present a hierarchical Bayesian model for calibrating divergence time analyses with multiple fossil age constraints. This approach applies a Dirichlet process prior as a hyperprior on the parameters of calibration prior densities. Specifically, the model assumes that the rate parameters of exponential prior distributions on calibrated nodes are distributed according to a Dirichlet process, whereby the rate parameters are clustered into distinct parameter categories. Both simulated and biological data are analyzed to evaluate the performance of the Dirichlet process hyperprior. Compared with fixed exponential prior densities, the hierarchical Bayesian approach results in more accurate and precise estimates of internal node ages. When this hyperprior is applied using Markov chain Monte Carlo methods, the ages of calibrated nodes are sampled from mixtures of exponential distributions, and uncertainty in the values of calibration density parameters is taken into account.
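The clustering of exponential rate parameters described above can be illustrated with a minimal prior-only sketch. It is not the paper's implementation and does no MCMC or likelihood evaluation. It seats calibrated nodes in clusters with a Chinese restaurant process (one standard representation of a Dirichlet process), draws one rate per cluster, and samples each node age as its fossil minimum age plus an exponential waiting time. The Gamma base distribution, the concentration value `alpha`, the fossil ages, and all function names are assumptions introduced for illustration.

```python
import random


def sample_crp_assignments(n_nodes, alpha, rng):
    """Assign n_nodes items to clusters via the Chinese restaurant process."""
    assignments = []
    counts = []  # counts[k] = number of nodes already in cluster k
    for _ in range(n_nodes):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments


def sample_calibrated_ages(fossil_min_ages, alpha, gamma_shape, gamma_rate, rng):
    """Draw node ages from offset exponential priors with clustered rates.

    Assumption: the base distribution for the rates is Gamma(shape, rate);
    the source abstract does not specify it.
    """
    assignments = sample_crp_assignments(len(fossil_min_ages), alpha, rng)
    n_clusters = max(assignments) + 1
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    rates = [rng.gammavariate(gamma_shape, 1.0 / gamma_rate)
             for _ in range(n_clusters)]
    # Each age is the fossil minimum age plus an Exponential(rate) offset.
    ages = [min_age + rng.expovariate(rates[k])
            for min_age, k in zip(fossil_min_ages, assignments)]
    return assignments, rates, ages


if __name__ == "__main__":
    rng = random.Random(1)
    fossil_min_ages = [12.0, 35.5, 60.0, 90.0]  # hypothetical minimum ages (Myr)
    clusters, rates, ages = sample_calibrated_ages(
        fossil_min_ages, alpha=1.0, gamma_shape=2.0, gamma_rate=20.0, rng=rng)
    for min_age, k, age in zip(fossil_min_ages, clusters, ages):
        print(f"fossil >= {min_age:5.1f}  cluster {k}  rate {rates[k]:.3f}  age {age:.2f}")
```

Nodes that land in the same cluster share a rate parameter, so repeated draws place each node's age under a mixture of offset exponential distributions rather than a single fixed one. A smaller `alpha` favors fewer clusters, and a larger one favors more.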