Privacy-preserving record linkage on large real world datasets.

Author information

Randall Sean M, Ferrante Anna M, Boyd James H, Bauer Jacqueline K, Semmens James B

Affiliation

Centre for Population Health Research, Faculty of Health Sciences, Curtin University, Bentley 6102, WA, Australia.

Publication information

J Biomed Inform. 2014 Aug;50:205-12. doi: 10.1016/j.jbi.2013.12.003. Epub 2013 Dec 9.

Abstract

Record linkage typically involves dedicated linkage units, which are supplied with personally identifying information in order to identify individuals within and across datasets. Data custodians separate this personally identifying information from clinical information before release. While this substantially reduces the risk of disclosing sensitive information, some residual risk remains and is a concern for some custodians. In this paper we trial, on large real-world administrative data, a method of record linkage that reduces privacy risk still further. The method uses encrypted personally identifying information (Bloom filters) in a probability-based linkage framework. The privacy-preserving linkage method was tested on ten years of New South Wales (NSW) and Western Australian (WA) hospital admissions data, comprising over 26 million records in total. No difference in linkage quality was found when the results were compared with traditional probabilistic methods using full unencrypted personal identifiers. This is a possible means of reducing the privacy risks of record linkage in population-level research studies. It is hoped that, through adaptations of this method or similar privacy-preserving methods, the risks of information disclosure can be reduced so that the benefits of linked research can be fully realised.
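The abstract does not give implementation details, so the following is only a minimal sketch of how a Bloom-filter encoding of an identifier might be compared without exposing the plain text. Everything in it is an assumption made for illustration: the bigram tokenisation, the keyed double-hashing scheme, the filter length (`FILTER_BITS`), the number of hash functions (`NUM_HASHES`), the placeholder `SECRET_KEY`, and the use of the Dice coefficient as the similarity measure. None of these should be read as the authors' actual settings.

```python
import hashlib
import hmac

# All parameters below are illustrative assumptions, not values from the paper.
FILTER_BITS = 1000                 # assumed Bloom filter length in bits
NUM_HASHES = 30                    # assumed number of hash functions per bigram
SECRET_KEY = b"shared-secret-key"  # hypothetical key agreed by data custodians


def bigrams(value):
    """Split a normalised, padded string into its set of character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, key=SECRET_KEY):
    """Return the set of bit positions set in the Bloom filter for `value`.

    Each bigram is hashed with two keyed hashes, and the positions
    (h1 + i * h2) mod FILTER_BITS for i in 0..NUM_HASHES-1 are set.
    """
    positions = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(key, data, hashlib.sha512).digest(), "big")
        for i in range(NUM_HASHES):
            positions.add((h1 + i * h2) % FILTER_BITS)
    return frozenset(positions)


def dice_similarity(filter_a, filter_b):
    """Dice coefficient between two Bloom filters given as sets of set bits."""
    if not filter_a and not filter_b:
        return 1.0
    return 2 * len(filter_a & filter_b) / (len(filter_a) + len(filter_b))


if __name__ == "__main__":
    smith = bloom_encode("Smith")
    smyth = bloom_encode("Smyth")
    jones = bloom_encode("Jones")
    print("Smith vs Smyth:", dice_similarity(smith, smyth))
    print("Smith vs Jones:", dice_similarity(smith, jones))
```

Under these assumptions, a linkage unit would receive only the encoded filters. Similar names share many bigrams and therefore many set bits, so a field-level similarity score of this kind could stand in for the string comparison used when assigning agreement weights in a probabilistic linkage framework.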

