Department of Biology, Syracuse University, Syracuse, New York, United States of America.
PLoS One. 2009 Nov 23;4(11):e7957. doi: 10.1371/journal.pone.0007957.
BACKGROUND: Comparing patterns of divergence among separate lineages or groups has posed an especially difficult challenge for biologists. Recently, a new, conceptually simple methodology called the "ordered-axis plot" approach was introduced for comparing patterns of diversity in a common morphospace. The technique combines principal components analysis (PCA) with linear regression. Because both statistics are in common use, the ordered-axis plot approach has high potential for widespread adoption. However, the approach has a number of drawbacks, most notably that the lineages with the greatest variance will largely bias interpretations from analyses involving a common morphospace. Therefore, unless a set of a priori requirements regarding data structure is met, the ordered-axis plot approach will likely produce misleading results.
METHODOLOGY/PRINCIPAL FINDINGS: Morphological data sets from cichlid fishes endemic to Lakes Tanganyika, Malawi, and Victoria were used to demonstrate statistically how separate groups can contribute differently to a common morphospace produced by a PCA. Through a matrix superimposition of eigenvectors (scale-free trajectories of variation identified by PCA), we show that some groups contribute more than others to the trajectories of variation identified in a common morphospace. Furthermore, through a set of randomization tests, we show that a common morphospace model partitions variation differently than group-specific models do. Finally, we demonstrate how these limitations may influence an ordered-axis plot approach by performing a comparison on data sets with known alterations in covariance structure. Using these results, we provide a set of criteria that must be met before a common morphospace can be reliably used.
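The point that a high-variance group can dominate a common morphospace can be illustrated with a minimal sketch on synthetic data. This is not the authors' matrix superimposition or randomization procedure; it only compares the angle between the first principal axis of a pooled PCA and the first axis of each group's own PCA. The group names, sample sizes, trait count, and variances are all hypothetical, and both groups are assumed to share the same mean.

```python
import numpy as np

def first_pc(data):
    """Return the unit eigenvector of the covariance matrix with the largest eigenvalue."""
    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]

def angle_deg(u, v):
    """Angle in degrees between two axes, ignoring sign (an axis has no direction)."""
    cos = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))

rng = np.random.default_rng(0)
n = 200  # hypothetical number of specimens per group

# Hypothetical group A: large variance, concentrated on trait 0.
group_a = rng.normal(size=(n, 3)) * np.array([5.0, 1.0, 1.0])
# Hypothetical group B: small variance, concentrated on trait 1.
group_b = rng.normal(size=(n, 3)) * np.array([0.5, 1.5, 0.5])

# Common morphospace: one PCA on the pooled specimens.
pc_common = first_pc(np.vstack([group_a, group_b]))

# Compare the common PC1 with each group's own PC1.
angle_a = angle_deg(pc_common, first_pc(group_a))
angle_b = angle_deg(pc_common, first_pc(group_b))

print(f"angle between common PC1 and group A PC1: {angle_a:.1f} degrees")
print(f"angle between common PC1 and group B PC1: {angle_b:.1f} degrees")
```

Under these assumptions the common PC1 lies close to group A's main axis and nearly orthogonal to group B's, so scores on the common axis say little about how group B actually varies.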
CONCLUSIONS/SIGNIFICANCE: Our results suggest that a common morphospace produced by PCA would not be useful for producing biologically meaningful results unless a restrictive set of criteria is met. We therefore suggest that biologists be aware of the limitations of the ordered-axis plot approach before applying it to their own data, and that they consider other, less restrictive methods for addressing the same question.