'Unmasking' masked address data: A medoid geocoding solution.

Author information

Helderop Edward, Nelson Jake R, Grubesic Tony H

Affiliations

Center for Geospatial Sciences, School of Public Policy, University of California Riverside.

Department of Geosciences, Auburn University.

Publication information

MethodsX. 2023 Feb 22;10:102090. doi: 10.1016/j.mex.2023.102090. eCollection 2023.

Abstract

In recent years, there has been a consistent push for more open data initiatives, particularly for datasets collected by public agencies or groups that receive public funding. However, there is a tension between the release of open data and the preservation of individual and household privacy, whose balance shifts due to increased data availability, the sophistication of analysis techniques, and the computational power available to users. As a result, data masking is a standard tool used to preserve privacy. This is a process in which the data publishers obfuscate some identifying features in the dataset while attempting to maintain as much accuracy and precision as possible. For spatial datasets, the geocoding of administratively-masked data has been a consistent problem. Here, we present a medoid-based technique that geocodes masked data while minimizing the spatial uncertainty associated with the masking approach. Unfortunately, many commercial geocoding software packages either fail to geocode administratively-masked data or provide false positives by assigning points to city or street centroids. We demonstrate the results of our medoid-based geocoding approach by comparing it to commercial geocoding software. The results suggest that a medoid geocoding approach is mechanically simple to deploy and maximizes the spatial accuracy of the resulting geocodes.

•Administratively-masked data are difficult to geocode

•A medoid geocoding method maximizes geocoding accuracy

•This method outperforms commercial geocoding software.
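The abstract does not spell out how the medoid is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes, hypothetically, that each masked record can be expanded into a set of candidate address points (the `candidates` list and its coordinates are invented for illustration) and that great-circle distance is an acceptable metric. The medoid is the candidate whose total distance to all other candidates is smallest, so the returned geocode is always a real candidate location rather than an averaged point.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def medoid(points):
    """Return the member of `points` with the smallest summed distance to the rest."""
    if not points:
        raise ValueError("need at least one candidate point")
    return min(points, key=lambda p: sum(haversine_m(p, q) for q in points))


# Hypothetical candidate address points for one masked record (assumed data).
candidates = [
    (33.9800, -117.3750),
    (33.9801, -117.3750),
    (33.9802, -117.3750),
    (33.9803, -117.3750),
    (33.9810, -117.3750),
]

geocode = medoid(candidates)
print(geocode)
```

In this sketch the outlying fifth candidate pulls an arithmetic mean away from every real address, whereas the medoid stays on an existing candidate. This is one plausible reading of why a medoid could limit spatial uncertainty; the exhaustive pairwise search is quadratic in the number of candidates, which should be acceptable for the small candidate sets assumed here.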

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f69b/10006849/c6e43f6c31c0/ga1.jpg
