Quantifying the Correctness, Computational Complexity, and Security of Privacy-Preserving String Comparators for Record Linkage.

Author information

Durham Elizabeth, Xue Yuan, Kantarcioglu Murat, Malin Bradley

Affiliation

Department of Biomedical Informatics, Vanderbilt University, 2525 West End Avenue, Nashville, TN 37203, USA.

Publication information

Inf Fusion. 2012 Oct 1;13(4):245-259. doi: 10.1016/j.inffus.2011.04.004.

Abstract

Record linkage is the task of identifying records from disparate data sources that refer to the same entity. It is an integral component of data processing in distributed settings, where the integration of information from multiple sources can prevent duplication and enrich overall data quality, thus enabling more detailed and correct analysis. Privacy-preserving record linkage (PPRL) is a variant of the task in which data owners wish to perform linkage without revealing identifiers associated with the records. This task is desirable in various domains, including healthcare, where it may not be possible to reveal patient identity due to confidentiality requirements, and business, where it could be disadvantageous to divulge customers' identities. To perform PPRL, it is necessary to apply string comparators that function in the privacy-preserving space. A number of privacy-preserving string comparators (PPSCs) have been proposed, but little research has compared them in the context of a real record linkage application. This paper performs a principled and comprehensive evaluation of six PPSCs in terms of three key properties: 1) correctness of record linkage predictions, 2) computational complexity, and 3) security. We use a real, publicly available dataset, derived from the North Carolina voter registration database, to evaluate the tradeoffs among these properties. Among our results, we find that PPSCs that partition, encode, and compare strings yield highly accurate record linkage results. However, as a tradeoff, we observe that such PPSCs are less secure than those that map and compare strings in a reduced-dimensional space.
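
To make the "partition, encode, and compare" idea concrete, the following is a minimal sketch of one well-known comparator of that kind: character bigrams hashed into a Bloom filter and compared with the Dice coefficient. The abstract does not name the six PPSCs evaluated, so the choice of technique here is an illustrative assumption, as are the function names (`bigrams`, `bloom_encode`, `dice`), the filter length `m`, the hash count `k`, and the shared secret key. It is not the authors' implementation.

```python
import hashlib
import hmac


def bigrams(s):
    """Partition: split a padded, lowercased string into its set of character bigrams."""
    padded = f"_{s.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(s, key, m=1000, k=20):
    """Encode: map each bigram to k bit positions in a filter of length m.

    The filter is represented as the set of positions that are set to 1.
    m, k, and the keyed-hash construction are assumptions for this sketch.
    """
    bits = set()
    for gram in bigrams(s):
        for i in range(k):
            digest = hmac.new(key, f"{i}:{gram}".encode("utf-8"), hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % m)
    return bits


def dice(a, b):
    """Compare: Dice coefficient of two encodings (1.0 means identical bit sets)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    key = b"secret shared by the data owners"  # hypothetical key
    smith = bloom_encode("Smith", key)
    print("Smith vs Smyth:", dice(smith, bloom_encode("Smyth", key)))
    print("Smith vs Jones:", dice(smith, bloom_encode("Jones", key)))
```

Both data owners would encode their identifiers with the same key and exchange only the encodings. Similar strings share bigrams and therefore many bit positions, which is why this family can support accurate linkage. The same preserved similarity structure is consistent with the abstract's observation that such comparators are less secure than those that map strings into a reduced-dimensional space.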
