School of Biological Sciences, University of Queensland, Brisbane, 4072 Queensland, Australia.
BMC Evol Biol. 2012 Jun 28;12:102. doi: 10.1186/1471-2148-12-102.
Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, arising from either measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Ignoring them gives a false perception of precision (confidence intervals that are too narrow) and inflates significance in hypothesis testing (e.g. p-values that are too small). Although some application-specific software exists for fitting Bayesian models that account for phylogenetic error, more general and flexible software is desirable.
We developed models that directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov chain Monte Carlo (MCMC) analyses.
We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets and applied to a real data set of plant traits from rainforest species in northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS.
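A rough way to see what a prior over a tree set does to an estimate: if every tree in the set carries equal weight, the marginal distribution of a parameter is an equal-weight mixture of its per-tree distributions. By the law of total variance, its variance is the average within-tree variance plus the between-tree variance of the per-tree means. The Python sketch below illustrates only that arithmetic. It is not the BUGS model from the paper; the function name `pool_over_trees` and the numbers are hypothetical, and it assumes per-tree slope estimates and variances are already available. It also ignores that, in a full Bayesian fit, the trait data can reweight the trees.

```python
from statistics import fmean


def pool_over_trees(estimates, variances):
    """Mean and variance of an equal-weight mixture over trees.

    estimates: per-tree point estimates of one parameter (e.g. a slope).
    variances: per-tree variances of that parameter, in the same order.
    Both inputs are assumed to come from separate single-tree fits.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one variance per estimate, and at least one tree")
    pooled_mean = fmean(estimates)
    # Average uncertainty that remains once a tree is fixed.
    within = fmean(variances)
    # Extra uncertainty because the estimate shifts from tree to tree.
    between = fmean([(b - pooled_mean) ** 2 for b in estimates])
    return pooled_mean, within + between


# Hypothetical slopes from three trees, each with variance 0.01.
mean, var = pool_over_trees([0.40, 0.50, 0.60], [0.01, 0.01, 0.01])
print(f"pooled slope = {mean:.3f}, pooled variance = {var:.4f}")
```

The between-tree term is exactly what a single-tree analysis sets to zero, which is one reason intervals conditioned on one tree can be too narrow.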
Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement error and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible, general-purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and for accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.
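Why error in an explanatory variable matters can be shown with a small simulation. Under classical measurement error (independent noise added to the true predictor), the ordinary least squares slope shrinks towards zero by the reliability ratio var(x) / (var(x) + var(u)). The Python sketch below is a non-phylogenetic illustration under assumed values (true slope 2, unit variances, a known error variance). All names and numbers are hypothetical, and the moment-based correction shown is a stand-in for, not a copy of, the Bayesian measurement error models described above.

```python
import random
from statistics import fmean


def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, true_slope=2.0, error_sd=1.0, seed=1):
    """Return (naive_slope, corrected_slope) for noisy predictor values."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [true_slope * x + rng.gauss(0.0, 0.5) for x in x_true]
    # Observed predictor = true value + independent measurement error.
    x_obs = [x + rng.gauss(0.0, error_sd) for x in x_true]
    naive = ols_slope(x_obs, y)
    # Method-of-moments correction, assuming the error variance is known.
    m = fmean(x_obs)
    var_obs = fmean([(x - m) ** 2 for x in x_obs])
    reliability = (var_obs - error_sd ** 2) / var_obs
    return naive, naive / reliability


naive, corrected = simulate()
print(f"naive slope = {naive:.2f}, corrected slope = {corrected:.2f}")
```

With these assumed variances the reliability ratio is about 0.5, so the naive slope lands near half the true value, while the corrected slope recovers it approximately.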