Modeling the probability distribution of positional errors incurred by residential address geocoding.

Authors

Zimmerman Dale L, Fang Xiangming, Mazumdar Soumya, Rushton Gerard

Affiliation

Department of Statistics and Actuarial Science and Center for Health Policy and Research, University of Iowa, Iowa City, IA 52242, USA.

Publication

Int J Health Geogr. 2007 Jan 10;6:1. doi: 10.1186/1476-072X-6-1.

Abstract

BACKGROUND

The assignment of a point-level geocode to subjects' residences is an important data assimilation component of many geographic public health studies. Often, these assignments are made by a method known as automated geocoding, which attempts to match each subject's address to an address-ranged street segment georeferenced within a streetline database and then interpolate the position of the address along that segment. Unfortunately, this process results in positional errors. Our study sought to model the probability distribution of positional errors associated with automated geocoding and E911 geocoding.
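The interpolation step described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the `StreetSegment` record, its field names, and the purely linear placement are assumptions, and production geocoders typically also handle street side (odd/even parity) and offsets, which are ignored here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StreetSegment:
    """Hypothetical address-ranged street segment (field names are assumptions)."""
    from_addr: int                # house number at the start of the segment
    to_addr: int                  # house number at the end of the segment
    start: Tuple[float, float]    # (x, y) of the segment start, projected coordinates
    end: Tuple[float, float]      # (x, y) of the segment end, projected coordinates


def interpolate_geocode(house_number: int, seg: StreetSegment) -> Tuple[float, float]:
    """Place a house number along the segment in proportion to its address range."""
    lo, hi = sorted((seg.from_addr, seg.to_addr))
    if not lo <= house_number <= hi:
        raise ValueError("house number is outside the segment's address range")
    if seg.to_addr == seg.from_addr:
        frac = 0.5  # degenerate range: fall back to the segment midpoint
    else:
        frac = (house_number - seg.from_addr) / (seg.to_addr - seg.from_addr)
    x = seg.start[0] + frac * (seg.end[0] - seg.start[0])
    y = seg.start[1] + frac * (seg.end[1] - seg.start[1])
    return (x, y)
```

Because the true residence rarely sits at the proportional position implied by its house number, and is usually set back from the street centerline, this kind of interpolation is one plausible source of the positional errors the study models.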

RESULTS

Positional errors were determined for 1423 rural addresses in Carroll County, Iowa as the vector difference between each 100%-matched automated geocode and its true location as determined by orthophoto and parcel information. Errors were also determined for 1449 60%-matched geocodes and 2354 E911 geocodes. Huge (> 15 km) outliers occurred among the 60%-matched geocoding errors; outliers also occurred for the other two types of geocoding errors but were much smaller. E911 geocoding was more accurate (median error length = 44 m) than 100%-matched automated geocoding (median error length = 168 m). The empirical distributions of positional errors associated with 100%-matched automated geocoding and E911 geocoding exhibited a distinctive Greek-cross shape and had many other interesting features that could not be fitted adequately by a single bivariate normal or t distribution. However, mixtures of t distributions with two or three components fit the errors very well.
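As a rough illustration of the two quantities reported here, the sketch below computes error vectors and the median error length. It assumes coordinates are already in a projected system measured in metres; the function names are hypothetical and any input data would be the reader's own, not the study's.

```python
import math
import statistics
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def error_vector(geocode: Point, truth: Point) -> Point:
    """Positional error as the vector difference: geocode minus true location."""
    return (geocode[0] - truth[0], geocode[1] - truth[1])


def error_lengths(geocodes: Sequence[Point], truths: Sequence[Point]) -> List[float]:
    """Euclidean length of each error vector (same units as the coordinates)."""
    if len(geocodes) != len(truths):
        raise ValueError("geocodes and truths must have the same length")
    return [math.hypot(*error_vector(g, t)) for g, t in zip(geocodes, truths)]


def median_error_length(geocodes: Sequence[Point], truths: Sequence[Point]) -> float:
    """Median of the error lengths, a summary that is robust to large outliers."""
    return statistics.median(error_lengths(geocodes, truths))
```

The median is a natural summary in this setting because a handful of very large outliers, such as those reported for the 60%-matched geocodes, would dominate a mean.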

CONCLUSION

Mixtures of bivariate t distributions with few components appear to be flexible enough to fit many positional error datasets associated with geocoding, yet parsimonious enough to be feasible for nascent applications of measurement-error methodology to spatial epidemiology.
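To make the model class concrete, the following sketch evaluates the density of a finite mixture of bivariate t distributions using the standard closed form for dimension two. The function names and the component layout (weight, location, scale matrix, degrees of freedom) are assumptions for illustration; no parameter values from the study are reproduced, and the fitting step is not shown.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float]
Mat = Tuple[Tuple[float, float], Tuple[float, float]]
Component = Tuple[float, Vec, Mat, float]  # (weight, location, scale matrix, df)


def bivariate_t_pdf(x: float, y: float, mu: Vec, sigma: Mat, nu: float) -> float:
    """Density of a bivariate t distribution at (x, y).

    sigma is a symmetric positive-definite 2x2 scale matrix and nu > 0 is the
    degrees of freedom. In two dimensions the normalising constant simplifies
    to 1 / (2 * pi * sqrt(det(sigma))).
    """
    (a, b), (_, c) = sigma
    det = a * c - b * b
    if det <= 0 or a <= 0 or nu <= 0:
        raise ValueError("sigma must be positive definite and nu must be positive")
    dx, dy = x - mu[0], y - mu[1]
    # Quadratic form (v - mu)^T sigma^{-1} (v - mu), using the 2x2 inverse.
    delta = (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return (1.0 + delta / nu) ** (-(nu + 2.0) / 2.0) / (2.0 * math.pi * math.sqrt(det))


def mixture_pdf(x: float, y: float, components: Sequence[Component]) -> float:
    """Weighted sum of bivariate t densities; weights must sum to 1."""
    total_weight = sum(w for w, _, _, _ in components)
    if not math.isclose(total_weight, 1.0, rel_tol=1e-9, abs_tol=1e-12):
        raise ValueError("mixture weights must sum to 1")
    return sum(w * bivariate_t_pdf(x, y, mu, s, nu) for w, mu, s, nu in components)
```

A mixture with one tight component and one diffuse component, for example, can place most mass near the true location while still allowing heavy tails, which is the kind of flexibility the conclusion refers to.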

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f7/1781422/dc57b68a569f/1476-072X-6-1-1.jpg
