Using imputation to provide location information for nongeocoded addresses.

Author information

Department of Environmental Health Sciences and Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America.

Publication information

PLoS One. 2010 Feb 10;5(2):e8998. doi: 10.1371/journal.pone.0008998.

Abstract

BACKGROUND: The importance of geography as a source of variation in health research continues to receive sustained attention in the literature. Including geographic information in such research often begins with adding data to a map, which is predicated on some knowledge of location. A precise level of spatial information is conventionally achieved through geocoding, the geographic information system (GIS) process of translating mailing address information into coordinates on a map. The geocoding process is not without limitations, though, since there is always a percentage of addresses that cannot be converted successfully (nongeocodable). This raises concerns about bias, since the traditional practice has been to exclude nongeocoded data records from analysis.

METHODOLOGY/PRINCIPAL FINDINGS: In this manuscript we develop and evaluate a set of imputation strategies for dealing with missing spatial information from nongeocoded addresses. The strategies are developed assuming a known zip code with increasing use of collateral information, namely the spatial distribution of the population at risk. Strategies are evaluated using prostate cancer data obtained from the Maryland Cancer Registry. We consider total case enumerations at the Census county, tract, and block group level as the outcome of interest when applying and evaluating the methods. Multiple imputation is used to provide estimated total case counts based on complete data (geocodes plus imputed nongeocodes) with a measure of uncertainty. Results indicate that the imputation strategy based on using available population-based age, gender, and race information performed the best overall at the county, tract, and block group levels.
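As an illustration of the general idea only, the following minimal Python sketch assigns each nongeocoded case to a census unit inside its known zip code, with probability proportional to a population-at-risk weight, and repeats the assignment several times to obtain a mean count and a between-imputation spread. It is not the authors' implementation: the function name `impute_counts`, the tract labels, the zip codes, and the weights are all hypothetical, and the standard deviation across imputations is used here as a simple stand-in for whatever uncertainty measure the paper reports.

```python
import random
from collections import Counter
from statistics import mean, stdev


def impute_counts(geocoded_counts, nongeocoded_zips, zip_to_units,
                  n_imputations=20, seed=0):
    """Return {unit: (mean_count, sd_across_imputations)}.

    geocoded_counts  : {unit: number of successfully geocoded cases}
    nongeocoded_zips : list with one zip code per nongeocoded case
    zip_to_units     : {zip: {unit: population-at-risk weight}}
                       (hypothetical lookup; the weights are assumptions)
    """
    if n_imputations < 2:
        raise ValueError("need at least two imputations to measure spread")

    rng = random.Random(seed)

    # Every unit that appears in either input gets a result.
    units = set(geocoded_counts)
    for weights in zip_to_units.values():
        units.update(weights)

    draws = {unit: [] for unit in units}
    for _ in range(n_imputations):
        # Start each imputed data set from the observed geocoded counts.
        totals = Counter(geocoded_counts)
        for zip_code in nongeocoded_zips:
            candidates = zip_to_units[zip_code]
            names = list(candidates)
            # Draw one unit with probability proportional to its weight.
            chosen = rng.choices(
                names, weights=[candidates[n] for n in names], k=1
            )[0]
            totals[chosen] += 1
        for unit in units:
            draws[unit].append(totals[unit])

    return {unit: (mean(vals), stdev(vals)) for unit, vals in draws.items()}


# Hypothetical example: three tracts, two zip codes, six nongeocoded cases.
geocoded = {"T1": 10, "T2": 5, "T3": 7}
zip_lookup = {"21201": {"T1": 300, "T2": 100}, "21202": {"T3": 50}}
missing = ["21201"] * 4 + ["21202"] * 2

result = impute_counts(geocoded, missing, zip_lookup)
for unit in sorted(result):
    print(unit, result[unit])
```

In this sketch, richer strategies would differ only in how the weights are built, for example by using the population in the case's own age, gender, and race stratum rather than the total population of each unit.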

CONCLUSIONS/SIGNIFICANCE: The procedure allows the potentially biased and likely underreported outcome (case enumerations based only on the geocoded records) to be presented alongside a statistically adjusted count (the imputed count) and a measure of uncertainty, both based on all the case data: the geocodes and the imputed nongeocodes. Similar strategies can be applied in other analysis settings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84f6/2818716/10c506a8db76/pone.0008998.g001.jpg
