Department of Integrative Biology, University of California, Berkeley, CA, USA.
University Herbarium and Department of Integrative Biology, University of California, Berkeley, CA, USA.
Syst Biol. 2023 Jun 17;72(3):713-722. doi: 10.1093/sysbio/syad010.
Time-calibrated phylogenetic trees are a tremendously powerful tool for studying evolutionary, ecological, and epidemiological phenomena. Such trees are predominantly inferred in a Bayesian framework, with the phylogeny itself treated as a parameter with a prior distribution (a "tree prior"). However, we show that the tree "parameter" consists, in part, of data, in the form of taxon samples. Treating the tree as a parameter fails to account for these data and compromises our ability to compare among models using standard techniques (e.g., marginal likelihoods estimated using path-sampling and stepping-stone sampling algorithms). Since accuracy of the inferred phylogeny strongly depends on how well the tree prior approximates the true diversification process that gave rise to the tree, the inability to accurately compare competing tree priors has broad implications for applications based on time-calibrated trees. We outline potential remedies to this problem, and provide guidance for researchers interested in assessing the fit of tree models. [Bayes factors; Bayesian model comparison; birth-death models; divergence-time estimation; lineage diversification].
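For readers unfamiliar with the stepping-stone estimator mentioned above, the following is a minimal sketch on a toy model, not the phylogenetic setting of the paper. It assumes data y_i ~ Normal(mu, 1) with prior mu ~ Normal(0, 1), chosen because the power posteriors can be sampled exactly and the marginal likelihood has a closed form to check against. The function names, the number of stones, and the Beta(0.3, 1)-style spacing of the powers are illustrative assumptions. Here the data are fixed and mu is an ordinary parameter, which is the situation in which the estimator is expected to behave well; the abstract's concern is that a tree "parameter" does not fit this description.

```python
import math
import random


def log_likelihood(mu, n, sum_y, sum_y2):
    """Log of prod_i Normal(y_i | mu, 1), computed from sufficient statistics."""
    squared_error = sum_y2 - 2.0 * mu * sum_y + n * mu * mu
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * squared_error


def exact_log_marginal(n, sum_y, sum_y2):
    """Closed-form log marginal likelihood for the toy conjugate model."""
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n)
            - 0.5 * (sum_y2 - sum_y ** 2 / (1.0 + n)))


def stepping_stone(n, sum_y, sum_y2, rng, num_stones=32, draws=2000):
    """Stepping-stone estimate of the log marginal likelihood.

    Powers beta_k run from 0 (prior) to 1 (posterior). Each step estimates
    the ratio of normalising constants Z(beta_k) / Z(beta_{k-1}) by averaging
    likelihood ** (beta_k - beta_{k-1}) over draws from the power posterior
    at beta_{k-1}.
    """
    # Assumed spacing: powers concentrated near 0, where the likelihood
    # raised to a small power changes fastest.
    betas = [(k / num_stones) ** (1.0 / 0.3) for k in range(num_stones + 1)]
    log_z = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Power posterior for this toy model is Normal(mean, 1 / precision).
        precision = 1.0 + b_prev * n
        mean = b_prev * sum_y / precision
        sd = precision ** -0.5
        terms = [
            (b_next - b_prev)
            * log_likelihood(rng.gauss(mean, sd), n, sum_y, sum_y2)
            for _ in range(draws)
        ]
        # Log-mean-exp for numerical stability.
        largest = max(terms)
        log_z += largest + math.log(
            sum(math.exp(t - largest) for t in terms) / draws
        )
    return log_z


if __name__ == "__main__":
    rng = random.Random(1)
    y = [rng.gauss(0.5, 1.0) for _ in range(20)]  # simulated toy data
    n, sum_y, sum_y2 = len(y), sum(y), sum(v * v for v in y)
    print("stepping-stone:", stepping_stone(n, sum_y, sum_y2, rng))
    print("exact:         ", exact_log_marginal(n, sum_y, sum_y2))
```

A Bayes factor between two such models would be the difference of their log marginal likelihoods, each estimated this way on the same fixed data.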