The Macaulay Institute, Craigiebuckler, Aberdeen, AB15 8QH, UK.
Ecol Lett. 2010 Feb;13(2):246-64. doi: 10.1111/j.1461-0248.2009.01422.x.
Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate these using simulation studies. Our comparative analysis uses methods including generalized least squares, spatial filters, wavelet-revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some that violate standard regression assumptions. We assess the performance of each method using two measures and using statistical error rates for model selection. Methods that performed well included the generalized least squares family of models and a Bayesian implementation of the conditional autoregressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection seen with the above methods. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.
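The kind of simulation comparison described above can be illustrated with a minimal sketch. It is not the authors' code or design: the grid size, the exponential covariance, the range parameter and all function names are assumptions made for illustration, and the generalized least squares fit is given the true error covariance, which a real analysis would have to estimate. The sketch simulates a spatially autocorrelated covariate and an independent, spatially autocorrelated response (true slope zero), then records how often ordinary and generalized least squares reject the null hypothesis at a nominal 5% level.

```python
import numpy as np


def exponential_cov(coords, range_par):
    """Exponential covariance exp(-d / range_par) between all pairs of points."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / range_par)


def slope_statistic(X, y, cov):
    """Slope estimate divided by its standard error under error covariance `cov`.

    The data are whitened with the Cholesky factor of `cov`, so passing the
    identity matrix gives ordinary least squares and passing the true
    covariance gives generalized least squares.
    """
    L = np.linalg.cholesky(cov)
    Xw = np.linalg.solve(L, X)
    yw = np.linalg.solve(L, y)
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(Xw.T @ Xw)[1, 1])
    return beta[1] / se


def type_one_error_rates(n_side=10, range_par=3.0, n_sims=500, seed=0):
    """Return (OLS rate, GLS rate) of false rejections when the true slope is 0."""
    rng = np.random.default_rng(seed)
    gx, gy = np.meshgrid(np.arange(n_side), np.arange(n_side))
    coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
    n = coords.shape[0]
    cov = exponential_cov(coords, range_par)  # assumed "true" spatial covariance
    L = np.linalg.cholesky(cov)
    identity = np.eye(n)
    critical = 1.96  # normal approximation to the two-sided 5% threshold

    ols_rejections = 0
    gls_rejections = 0
    for _ in range(n_sims):
        x = L @ rng.standard_normal(n)  # spatially autocorrelated covariate
        y = L @ rng.standard_normal(n)  # autocorrelated response, unrelated to x
        X = np.column_stack([np.ones(n), x])
        ols_rejections += abs(slope_statistic(X, y, identity)) > critical
        gls_rejections += abs(slope_statistic(X, y, cov)) > critical
    return ols_rejections / n_sims, gls_rejections / n_sims


if __name__ == "__main__":
    ols_rate, gls_rate = type_one_error_rates()
    print(f"OLS Type I error rate: {ols_rate:.3f}")
    print(f"GLS Type I error rate: {gls_rate:.3f}")
```

Under these assumptions, the ordinary least squares rejection rate should sit well above the nominal level while the generalized least squares rate stays close to it. This covers only one of the error-rate comparisons the abstract refers to, not the full set of methods or performance measures.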