Rohlf F J
Department of Ecology and Evolution, State University of New York at Stony Brook, 11794-5245, USA.
Evolution. 2001 Nov 11;55(11):2143-60. doi: 10.1111/j.0014-3820.2001.tb00731.x.
This study is concerned with statistical methods used for the analysis of comparative data (in which observations are not expected to be independent because they are sampled across phylogenetically related species). The phylogenetically independent contrasts (PIC), phylogenetic generalized least-squares (PGLS), and phylogenetic autocorrelation (PA) methods are compared. Although the independent contrasts are not orthogonal, they are independent if the data conform to the Brownian motion model of evolution on which they are based. It is shown that uncentered correlations and regressions through the origin using the PIC method are identical to those obtained using PGLS with an intercept included in the model. The PIC method is a special case of PGLS. Corrected standard errors are given for estimates of the ancestral states based on the PGLS approach. The treatment of trees with hard polytomies is discussed and is shown to be an algorithmic rather than a statistical problem. Some of the relationships among the methods are shown graphically using the multivariate space in which variables are represented as vectors with respect to OTUs used as coordinate axes. The maximum-likelihood estimate of the autoregressive parameter, ρ, has not been computed correctly in previous studies (an appendix with MATLAB code provides a corrected algorithm). The importance of the eigenvalues and eigenvectors of the connection matrix, W, for the distribution of ρ is discussed. The PA method is shown to have several problems that limit its usefulness in comparative studies. Although the PA method is a generalized least-squares procedure, it cannot be made equivalent to the PGLS method using a phylogenetic model.
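The stated equivalence between a through-origin regression on independent contrasts and a PGLS regression with an intercept can be checked numerically. The following is a minimal sketch, not the paper's own code: the four-tip tree, the trait values, and all function names are hypothetical and chosen only for illustration. It assumes a fully bifurcating tree and a Brownian motion model, so the covariance between two tips is the branch length they share from the root.

```python
import numpy as np

# Hypothetical 4-tip tree: ((A:1.0,B:2.0):1.0,(C:0.5,D:1.5):2.0);
# A leaf is (name, branch_length); an internal node is (left, right, branch_length).
TREE = ((("A", 1.0), ("B", 2.0), 1.0), (("C", 0.5), ("D", 1.5), 2.0), 0.0)

# Made-up trait values for illustration only.
X_VALS = {"A": 1.0, "B": 2.5, "C": 4.0, "D": 3.0}
Y_VALS = {"A": 2.0, "B": 2.2, "C": 5.1, "D": 4.0}


def tips(node):
    """List the tip names below a node, left to right."""
    if len(node) == 2:
        return [node[0]]
    return tips(node[0]) + tips(node[1])


def covariance(tree):
    """Brownian motion covariance: shared root-to-tip branch length per tip pair."""
    names = tips(tree)
    idx = {name: i for i, name in enumerate(names)}
    C = np.zeros((len(names), len(names)))

    def walk(node):
        # Each branch contributes its length to every pair of tips below it.
        members = [idx[name] for name in tips(node)]
        C[np.ix_(members, members)] += node[-1]
        if len(node) == 3:
            walk(node[0])
            walk(node[1])

    walk(tree)
    return names, C


def contrasts(node, values, out):
    """Felsenstein-style pruning; appends standardized contrasts to `out`.

    Returns the node's weighted state estimate and its adjusted branch length.
    """
    if len(node) == 2:
        return values[node[0]], node[1]
    x1, v1 = contrasts(node[0], values, out)
    x2, v2 = contrasts(node[1], values, out)
    out.append((x1 - x2) / np.sqrt(v1 + v2))
    state = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    return state, node[2] + v1 * v2 / (v1 + v2)


def pgls_slope(C, x, y):
    """GLS slope for y = b0 + b1 * x with error covariance proportional to C."""
    X = np.column_stack([np.ones_like(x), x])
    C_inv = np.linalg.inv(C)
    beta = np.linalg.solve(X.T @ C_inv @ X, X.T @ C_inv @ y)
    return beta[1]


names, C = covariance(TREE)
x = np.array([X_VALS[n] for n in names])
y = np.array([Y_VALS[n] for n in names])

cx, cy = [], []
contrasts(TREE, X_VALS, cx)
contrasts(TREE, Y_VALS, cy)
cx, cy = np.array(cx), np.array(cy)

pic_slope = np.sum(cx * cy) / np.sum(cx * cx)  # regression through the origin
gls_slope = pgls_slope(C, x, y)                # PGLS with an intercept

print(pic_slope, gls_slope)
```

Under these assumptions the two printed slopes should agree to floating-point precision. The contrasts for both traits are computed with the same traversal order, so the pairs line up node by node.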