Yang Z, Rannala B
Department of Integrative Biology, University of California, Berkeley 94720-3140, USA.
Mol Biol Evol. 1997 Jul;14(7):717-24. doi: 10.1093/oxfordjournals.molbev.a025811.
An improved Bayesian method is presented for estimating phylogenetic trees using DNA sequence data. The birth-death process with species sampling is used to specify the prior distribution of phylogenies and ancestral speciation times, and the posterior probabilities of phylogenies are used to estimate the maximum posterior probability (MAP) tree. Monte Carlo integration is used to integrate over the ancestral speciation times for particular trees. A Markov chain Monte Carlo method is used to generate the set of trees with the highest posterior probabilities. Methods are described for an empirical Bayesian analysis, in which estimates of the speciation and extinction rates are used in calculating the posterior probabilities, and a hierarchical Bayesian analysis, in which these parameters are removed from the model by an additional integration. The Markov chain Monte Carlo method removes the need, present in our earlier method for calculating MAP trees, to sum over all possible topologies (which limited the number of taxa in an analysis to about five). The methods are applied to DNA sequences from nine species of primates, and the MAP tree, which is identical to the maximum-likelihood estimate of topology, has a posterior probability of approximately 95%.
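The idea of using a Markov chain to find the trees with the highest posterior probabilities, rather than summing over every topology, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not describe the tree-proposal mechanism, so a uniform symmetric proposal is assumed here, and the function name `metropolis_topologies`, the three Newick strings, and their weights are hypothetical stand-ins for the integrated likelihood-times-prior term that the paper computes from sequence data.

```python
import math
import random
from collections import Counter


def metropolis_topologies(log_post, n_steps, seed=0):
    """Estimate posterior probabilities of topologies by Metropolis sampling.

    log_post maps each topology to an unnormalized log posterior.
    Only differences between these values are used, so the normalizing
    sum over all topologies is never computed.
    """
    rng = random.Random(seed)
    states = list(log_post)
    current = states[0]
    visits = Counter()
    for _ in range(n_steps):
        # Assumed proposal: pick any other topology uniformly (symmetric).
        proposal = rng.choice([s for s in states if s != current])
        log_ratio = log_post[proposal] - log_post[current]
        # Metropolis rule: always accept uphill moves, downhill with prob. ratio.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        visits[current] += 1
    return {s: visits[s] / n_steps for s in states}


# Hypothetical unnormalized posterior weights for three rooted topologies.
toy_log_post = {
    "((A,B),C)": math.log(0.90),
    "((A,C),B)": math.log(0.07),
    "((B,C),A)": math.log(0.03),
}

freqs = metropolis_topologies(toy_log_post, n_steps=20000)
map_tree = max(freqs, key=freqs.get)  # most frequently visited topology
```

In this toy setting the visit frequencies approximate the posterior probabilities of the topologies, and the most visited topology serves as the MAP estimate. A real analysis would replace the fixed weights with a likelihood integrated over speciation times and would propose local rearrangements of the current tree, since enumerating all topologies is infeasible for more than a handful of taxa.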