BioMedware, Inc., 516 North State Street, Ann Arbor, MI 48104-1236, USA.
Int J Health Geogr. 2009 Oct 28;8:60. doi: 10.1186/1476-072X-8-60.
Although sources of positional error in geographic locations (e.g. geocoding error) used for describing and modeling spatial patterns are widely acknowledged, research on how such error impacts the statistical results has been limited. In this paper we explore techniques for quantifying the perturbability of spatial weights to different specifications of positional error.
We find that a family of curves describes the relationship between perturbability and positional error, and use these curves to evaluate sensitivity of alternative spatial weight specifications to positional error both globally (when all locations are considered simultaneously) and locally (to identify those locations that would benefit most from increased geocoding accuracy). We evaluate the approach in simulation studies, and demonstrate it using a case-control study of bladder cancer in south-eastern Michigan.
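As a rough illustration of how such a curve might be traced, the sketch below jitters point locations at several error levels and records how much a k-nearest-neighbour weight set changes. The abstract does not define perturbability, so everything here is an assumption made for illustration: the definition used (the mean share of neighbour links that change), the isotropic Gaussian error model, and the function names `knn_weights`, `perturb`, and `perturbability`. It is not the authors' implementation.

```python
import math
import random


def knn_weights(points, k):
    """Return, for each point, the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def perturb(points, sigma, rng):
    """Add independent Gaussian noise (std dev sigma) to each coordinate.

    Assumed error model; the paper also considers other error shapes.
    """
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in points]


def perturbability(points, k, sigma, n_sims=50, seed=0):
    """Mean share of neighbour links lost under positional error.

    Assumed definition: 0 means the weights never change, 1 means every
    original neighbour link is replaced in every simulation.
    """
    rng = random.Random(seed)
    base = knn_weights(points, k)
    total = 0.0
    for _ in range(n_sims):
        noisy = knn_weights(perturb(points, sigma, rng), k)
        changed = sum(len(b - n) for b, n in zip(base, noisy))
        total += changed / (k * len(points))
    return total / n_sims


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical study area: 60 random points in a 100 x 100 square.
    pts = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(60)]
    for sigma in (0.0, 1.0, 5.0, 20.0):
        print(sigma, round(perturbability(pts, k=5, sigma=sigma), 3))
```

Repeating the loop for a different `k`, or for a denser set of points, gives one curve per weight specification or geography, which is the kind of comparison the sensitivity analysis relies on.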
Three results are significant. First, the shape of the probability distribution of positional error (e.g. circular, elliptical, cross) has little impact on the perturbability of spatial weights, which instead depends on the mean positional error. Second, our methodology allows researchers to evaluate the sensitivity of spatial statistics to positional accuracy for specific geographies. This has substantial practical implications, since it makes possible routine sensitivity analysis of spatial statistics to positional error arising in geocoded street addresses, global positioning systems, LIDAR and other geographic data. Third, those locations with high perturbability (most sensitive to positional error) and high leverage (contributing the most to the spatial weight being considered) will benefit the most from increased positional accuracy. These are rapidly identified using a new visualization tool we call the LIGA scatterplot.
Herein lies a paradox for spatial analysis: for a given level of positional error, increasing sample density to follow the underlying population distribution more accurately increases perturbability and introduces error into the spatial weights matrix. In some studies positional error may not affect the statistical results, while in others it may invalidate them. We must therefore understand the relationship between positional accuracy and the perturbability of the spatial weights in order to have confidence in a study's results.