Snelder Ton, Lehmann Anthony, Lamouroux Nicolas, Leathwick John, Allenbach Karin
Biologie des Ecosystèmes Aquatiques, CEMAGREF, Lyon, France.
Environ Manage. 2009 Oct;44(4):658-70. doi: 10.1007/s00267-009-9352-2. Epub 2009 Aug 18.
Numerical clustering has frequently been used to define hierarchically organized ecological regionalizations, but there has been little robust evaluation of their performance (i.e., the degree to which regions discriminate areas with similar ecological character). In this study we investigated the effect of the weighting and treatment of input variables on the performance of regionalizations defined by agglomerative clustering across a range of hierarchical levels. For this purpose, we developed three ecological regionalizations of Switzerland of increasing complexity using agglomerative clustering. Environmental data for our analysis were drawn from a 400 m grid and consisted of estimates of 11 environmental variables for each grid cell describing climate, topography and lithology. Regionalization 1 was defined from the environmental variables, which were given equal weights. We used the same variables in Regionalization 2 but weighted and transformed them on the basis of a dissimilarity model that was fitted to land cover composition data derived for a random sample of cells from interpretation of aerial photographs. Regionalization 3 was a further two-stage development of Regionalization 2 in which specific classifications, also weighted and transformed using dissimilarity models, were applied to 25 small-scale "sub-domains" defined by Regionalization 2. Performance was assessed in terms of the discrimination of land cover composition for an independent set of sites using classification strength (CS), which measured the similarity of land cover composition within classes and the dissimilarity between classes. Regionalization 2 performed significantly better than Regionalization 1, but the largest gains in performance, compared to Regionalization 1, occurred at coarse hierarchical levels (i.e., CS did not increase significantly beyond the 25-region level).
Regionalization 3 performed better than Regionalization 2 beyond the 25-region level and CS values continued to increase to the 95-region level. The results show that the performance of regionalizations defined by agglomerative clustering is sensitive to variable weighting and transformation. We conclude that large gains in performance can be achieved by training classifications using dissimilarity models. However, these gains are restricted to a narrow range of hierarchical levels because agglomerative clustering is unable to represent the variation in importance of variables at different spatial scales. We suggest that further advances in the numerical definition of hierarchically organized ecological regionalizations will be possible with techniques developed in the field of statistical modeling of the distribution of community composition.
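As an illustration of the performance measure, classification strength can be sketched as the mean similarity of land cover composition between sites in the same region minus the mean similarity between sites in different regions. The abstract does not give the exact formula, so the following is a minimal sketch under stated assumptions: Bray-Curtis similarity is assumed as the similarity measure, all site pairs are pooled rather than averaged per class, and the function names and example data are hypothetical.

```python
from itertools import combinations


def bray_curtis_similarity(a, b):
    """Similarity in [0, 1] between two composition vectors (1 = identical).

    Assumption: Bray-Curtis is used here for illustration only.
    """
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # two empty compositions are treated as identical
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / total


def classification_strength(compositions, labels):
    """Mean within-class similarity minus mean between-class similarity.

    Assumption: site pairs are pooled across classes; a per-class
    weighted mean is another common formulation.
    """
    within, between = [], []
    for i, j in combinations(range(len(compositions)), 2):
        s = bray_curtis_similarity(compositions[i], compositions[j])
        (within if labels[i] == labels[j] else between).append(s)
    if not within or not between:
        raise ValueError("need at least one within-class and one between-class pair")
    return sum(within) / len(within) - sum(between) / len(between)


# Hypothetical land cover proportions per site: (forest, grassland, rock).
sites = [
    [0.8, 0.2, 0.0],
    [0.7, 0.3, 0.0],
    [0.1, 0.2, 0.7],
    [0.0, 0.3, 0.7],
]
good_regions = ["a", "a", "b", "b"]  # groups similar sites together
poor_regions = ["a", "b", "a", "b"]  # splits similar sites apart

cs_good = classification_strength(sites, good_regions)
cs_poor = classification_strength(sites, poor_regions)
print(cs_good, cs_poor)
```

With these made-up sites, the regionalization that groups similar sites scores a higher CS than the one that splits them. Evaluating CS at each hierarchical level (for example at 25 and 95 regions) is how a curve of performance against number of regions could be traced.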