Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242, USA.
Int J Health Geogr. 2010 Feb 16;9:10. doi: 10.1186/1476-072X-9-10.
Automated geocoding of patient addresses for the purpose of conducting spatial epidemiologic studies results in positional errors. It is well documented that errors tend to be larger in rural areas than in cities, but possible effects of local characteristics of the street network, such as street intersection density and street length, on errors have not yet been documented. Our study quantifies effects of these local street network characteristics on the means and the entire probability distributions of positional errors, using regression methods and tolerance intervals/regions, for more than 6000 geocoded patient addresses from an Iowa county.
Positional errors were determined for 6376 addresses in Carroll County, Iowa, as the vector difference between each 100%-matched automated geocode and its ground-truthed location. Mean positional error magnitude was inversely related to proximate street intersection density. This effect was statistically significant for both rural and municipal addresses, but more so for the former. Also, the effect of street segment length on geocoding accuracy was statistically significant for municipal, but not rural, addresses; for municipal addresses mean error magnitude increased with length.
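The error definition and the inverse relationship described above can be illustrated with a minimal sketch. It is not the study's actual analysis. It assumes that coordinates are in a projected system with metre units, that the error vector is taken as geocode minus ground truth (the sign convention is an assumption), and that a plain least-squares line stands in for the regression methods; the function names and the toy numbers are hypothetical.

```python
import math

def positional_error(geocode_xy, truth_xy):
    """Return (dx, dy, magnitude) of the error vector geocode - truth.

    Both inputs are (x, y) pairs in the same projected coordinate
    system (assumed here to be in metres).
    """
    dx = geocode_xy[0] - truth_xy[0]
    dy = geocode_xy[1] - truth_xy[1]
    return dx, dy, math.hypot(dx, dy)

def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical toy values, not data from the study:
# intersection density near each address and the error magnitude (m).
density = [1.0, 2.0, 3.0]
error_m = [10.0, 8.0, 6.0]
slope, intercept = ols_slope_intercept(density, error_m)
```

A negative fitted slope in such a sketch corresponds to the inverse relationship reported here: error magnitude falls as nearby intersection density rises.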
Local street network characteristics may have statistically significant effects on geocoding accuracy in some places, but not others. Even in those locales where their effects are statistically significant, street network characteristics may explain a relatively small portion of the variability among geocoding errors. It appears that additional factors besides rurality and local street network characteristics affect accuracy in general.