Adams Dean C, Collyer Michael L
Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa, 50011; Department of Statistics, Iowa State University, Ames, Iowa, 50011.
Evolution. 2015 Mar;69(3):823-9. doi: 10.1111/evo.12596. Epub 2015 Feb 6.
Evaluating statistical trends in high-dimensional phenotypes poses challenges for comparative biologists, because the high dimensionality of the trait data relative to the number of species can prevent parametric tests from being computed. Recently, two comparative methods were proposed to circumvent this difficulty. One obtains phylogenetically independent contrasts (PICs) for all variables, and statistically evaluates the linear model by permuting the PICs of the response data. The other uses a distance-based approach to obtain coefficients for phylogenetic generalized least squares models (D-PGLS), and subsequently permutes the original data to evaluate the model effects. Here, we show that permuting PICs is not equivalent to permuting the data prior to the analyses, as is done in D-PGLS. We further explain why PICs are not the correct exchangeable units under the null hypothesis, and demonstrate that this misspecification of permutable units leads to inflated type I error rates of statistical tests. We then show that simply shuffling the original data and recalculating the independent contrasts with each iteration yields significance levels that correspond to those found using D-PGLS. Thus, while summary statistics from methods based on PICs and PGLS are the same, permuting PICs can lead to strikingly different statistical and biological inferences.
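The difference between the two permutation schemes described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the function names (`contrasts`, `model_ss`, `perm_test`), the choice of model sum of squares as the test statistic, the balanced eight-tip tree with unit branch lengths, and the random placeholder data are all hypothetical. Contrasts are computed with Felsenstein's (1985) pruning algorithm.

```python
import numpy as np

def contrasts(tree, data):
    """Standardized independent contrasts for an (n_tips, p) array.

    Assumed tree encoding: a tip is an int row index; an internal node is
    a tuple (left, left_branch_length, right, right_branch_length).
    """
    out = []

    def visit(node):
        if isinstance(node, int):            # tip: observed value, no added length
            return data[node], 0.0
        left, bl, right, br = node
        xl, el = visit(left)
        xr, er = visit(right)
        vl, vr = bl + el, br + er            # lengths inflated for estimated nodes
        out.append((xl - xr) / np.sqrt(vl + vr))        # standardized contrast
        anc = (xl / vl + xr / vr) / (1 / vl + 1 / vr)   # weighted ancestral value
        return anc, vl * vr / (vl + vr)

    visit(tree)
    return np.array(out)

def model_ss(xc, Yc):
    """Sum of squares explained by regressing Yc on xc through the origin."""
    xc = xc.ravel()
    return float(np.sum((xc @ Yc) ** 2) / (xc @ xc))

def perm_test(tree, x, Y, n_perm, permute_contrasts, rng):
    """Permutation p-value under one of the two schemes."""
    xc, Yc = contrasts(tree, x), contrasts(tree, Y)
    observed = model_ss(xc, Yc)
    count = 1                                # the observed value counts once
    for _ in range(n_perm):
        if permute_contrasts:
            # Scheme 1: shuffle the contrasts of the response directly.
            Yp = Yc[rng.permutation(len(Yc))]
        else:
            # Scheme 2: shuffle tip data, then recompute the contrasts.
            Yp = contrasts(tree, Y[rng.permutation(len(Y))])
        count += model_ss(xc, Yp) >= observed
    return observed, count / (n_perm + 1)

# Hypothetical balanced 8-tip tree with unit branch lengths.
tree = (((0, 1.0, 1, 1.0), 1.0, (2, 1.0, 3, 1.0), 1.0), 1.0,
        ((4, 1.0, 5, 1.0), 1.0, (6, 1.0, 7, 1.0), 1.0), 1.0)

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 1))                  # placeholder predictor
Y = rng.normal(size=(8, 5))                  # placeholder multivariate response

obs_pic, p_pic = perm_test(tree, x, Y, 999, True, rng)
obs_tip, p_tip = perm_test(tree, x, Y, 999, False, rng)
print(f"permuting PICs:     SS = {obs_pic:.3f}, p = {p_pic:.3f}")
print(f"permuting tip data: SS = {obs_tip:.3f}, p = {p_tip:.3f}")
```

Both calls return the same observed statistic, since the summary statistics do not depend on the permutation scheme; only the reference distribution differs. A single run on placeholder data says nothing about type I error rates, which would require repeating the test over many data sets simulated under the null hypothesis.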