Aboal J R, Real C, Fernández J A, Carballeira A
Area de Ecología, Universidad de Santiago de Compostela, 15782 Santiago de Compostela, Spain.
Sci Total Environ. 2006 Mar 1;356(1-3):256-74. doi: 10.1016/j.scitotenv.2005.04.025. Epub 2005 Jun 4.
In this paper we discuss some difficulties associated with the process of constructing maps of pollution from data obtained in surveys covering extensive areas. As we show here, these problems may be wide-ranging but are seldom recognized by investigators. The origin of the problems is the existence of multiple sources of pollution in the study area, each of different intensity and affecting areas of different extent. The particular spatial structure of the pollution sources interacts with the spatial layout of the samples, resulting in data sets with distributions that are very different from the usually assumed normal distribution, and characterized by heavy tails and gross outliers. These distributions arise because of incomplete sampling of small-scale pollution processes (i.e. those occurring on a spatial scale smaller than the spatial scale of the sampling grid). After discussion of the potential problems and appropriate techniques for analyzing this kind of data, we applied the proposed techniques to a real data set of heavy metal contents in terrestrial mosses. From the exercise we concluded that a) the first step in analysis of this kind of data must be to check for the presence of spatial structure on scales larger than the sampling grid, to avoid mapping noise, and b) the map generated must not contain information about pollution sources with a spatial scale smaller than the spatial scale of the sampling grid. We present and discuss the performance of robust statistical methods of testing for spatial structure (based on robust variograms and randomization testing) and of filtering the small-scale spatial processes (using median-polishing) prior to mapping.
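The two robust steps named in the abstract, testing for spatial structure and filtering small-scale processes by median polishing, can be illustrated with a minimal sketch. It is not the authors' implementation. The choice of the Cressie-Hawkins estimator as the "robust variogram", the one-sided permutation test on the shortest lag class, the regular grid, and all function names and parameters (`median_polish`, `robust_variogram`, `permutation_test`, `bin_edges`, `n_perm`) are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def median_polish(grid, n_iter=10, tol=1e-6):
    """Tukey median polish: grid = overall + row[i] + col[j] + resid[i, j]."""
    resid = np.asarray(grid, dtype=float).copy()
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        # Sweep row medians out of the residuals into the row effects.
        r_med = np.median(resid, axis=1)
        resid -= r_med[:, None]
        row += r_med
        shift = np.median(col)
        col -= shift
        overall += shift
        # Sweep column medians out of the residuals into the column effects.
        c_med = np.median(resid, axis=0)
        resid -= c_med[None, :]
        col += c_med
        shift = np.median(row)
        row -= shift
        overall += shift
        if max(np.abs(r_med).max(), np.abs(c_med).max()) < tol:
            break
    return overall, row, col, resid


def robust_variogram(coords, values, bin_edges):
    """Cressie-Hawkins semivariance per lag class (assumed estimator)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    root_diff = np.sqrt(np.abs(values[i] - values[j]))
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(bin_edges) - 1):
        mask = (dist > bin_edges[k]) & (dist <= bin_edges[k + 1])
        n = int(mask.sum())
        if n > 0:
            gamma[k] = 0.5 * root_diff[mask].mean() ** 4 / (0.457 + 0.494 / n)
    return gamma


def permutation_test(coords, values, bin_edges, n_perm=199, seed=0):
    """One-sided test: is short-lag semivariance lower than under shuffling?"""
    rng = np.random.default_rng(seed)
    observed = robust_variogram(coords, values, bin_edges)[0]
    shuffled = np.array([
        robust_variogram(coords, rng.permutation(values), bin_edges)[0]
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(shuffled <= observed)) / (n_perm + 1)
    return observed, p_value


# Synthetic 4 x 5 additive grid with one gross outlier (a "point source").
grid = 10.0 + np.arange(4.0)[:, None] + 2.0 * np.arange(5.0)[None, :]
grid[1, 4] += 100.0
overall, row, col, resid = median_polish(grid)
smooth = overall + row[:, None] + col[None, :]  # large-scale surface to map

# Synthetic 8 x 8 sampling grid with a smooth large-scale gradient.
xs, ys = np.meshgrid(np.arange(8.0), np.arange(8.0))
coords = np.column_stack([xs.ravel(), ys.ravel()])
values = xs.ravel() + 0.1 * ys.ravel()
gamma_short, p_value = permutation_test(coords, values, bin_edges=[0.0, 1.5, 3.0])
```

In this sketch the surface `smooth` carries only the large-scale row and column effects, while the isolated high value ends up in `resid` and would be kept off the map. The permutation test is run first in practice: a small `p_value` suggests spatial structure at scales larger than the grid spacing, so that mapping is not simply mapping noise.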