Lovasi Gina S, Weiss Jeremy C, Hoskins Richard, Whitsel Eric A, Rice Kenneth, Erickson Craig F, Psaty Bruce M
Columbia University, Institute of Social and Economic Research and Policy, New York, NY, USA.
Int J Health Geogr. 2007 Mar 16;6:12. doi: 10.1186/1476-072X-6-12.
Geocoding methods vary among spatial epidemiology studies. Errors in the geocoding process and differential match rates may reduce study validity. We compared two geocoding methods using 8,157 Washington State addresses. The multi-stage geocoding method implemented by the state health department used a sequence of local and national reference files. The single-stage method used a single national reference file. For each address geocoded by both methods, we measured the distance between the locations assigned by each method. Area-level characteristics were collected from census data, and modeled as predictors of the discordance between geocoded address coordinates.
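The discordance measure described above, the distance between the two locations assigned to the same address, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which distance formula was used, so the haversine great-circle formula, the function name `haversine_km`, and the mean Earth radius constant are all assumptions.

```python
import math

# Assumed constant: mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees.

    Hypothetical helper: the first point would be the location assigned by one
    geocoding method, the second the location assigned by the other method.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

For addresses within a single state, a projected coordinate system would serve equally well; the haversine form is used here only because it needs no external libraries.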
The multi-stage method had a higher match rate than the single-stage method: 99% versus 95%. Of the 7,686 addresses geocoded by both methods, 96% were assigned to the same census tract and 98% were assigned to locations within 1 km of each other. The distance between the two geocoded locations for the same address was greater in sparsely populated areas, in low-poverty areas, and in counties with local reference files.
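The summary statistics reported above (match rate per method, same-tract agreement, and the share of addresses within 1 km) could be computed along these lines. This is a sketch under stated assumptions: the `GeocodePair` record, its field names, and the `summarize_concordance` function are hypothetical and do not come from the study, and the data in the usage example are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class GeocodePair:
    """Hypothetical record holding both methods' results for one address."""
    tract_multi: Optional[str]    # census tract from multi-stage method; None if unmatched
    tract_single: Optional[str]   # census tract from single-stage method; None if unmatched
    distance_km: Optional[float]  # distance between the two locations; None unless both matched


def summarize_concordance(pairs: Sequence[GeocodePair], threshold_km: float = 1.0) -> dict:
    """Return match rates over all addresses and agreement over addresses matched by both."""
    n = len(pairs)
    if n == 0:
        raise ValueError("no addresses supplied")
    both = [p for p in pairs if p.tract_multi is not None and p.tract_single is not None]
    n_both = len(both)
    return {
        "match_rate_multi": sum(p.tract_multi is not None for p in pairs) / n,
        "match_rate_single": sum(p.tract_single is not None for p in pairs) / n,
        "n_both": n_both,
        # Agreement measures are undefined when no address was matched by both methods.
        "same_tract": (sum(p.tract_multi == p.tract_single for p in both) / n_both) if n_both else None,
        "within_threshold": (
            sum(p.distance_km is not None and p.distance_km <= threshold_km for p in both) / n_both
        ) if n_both else None,
    }


# Toy data, invented for illustration only.
example = [
    GeocodePair("T1", "T1", 0.2),
    GeocodePair("T1", "T2", 1.5),
    GeocodePair("T3", None, None),
    GeocodePair(None, None, None),
]
summary = summarize_concordance(example)
```

Note that the agreement measures use only the addresses geocoded by both methods as the denominator, mirroring how the abstract reports the 96% and 98% figures.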
The multi-stage geocoding method had a higher match rate than the single-stage method. An examination of differences in the location assigned to the same address suggested that study results may be most sensitive to the choice of geocoding method in sparsely populated or low-poverty areas.