A methodological assessment of privacy preserving record linkage using survey and administrative data.

Author information

Mirel Lisa B, Resnick Dean M, Aram Jonathan, Cox Christine S

Affiliations

Data Linkage Methodology and Analysis Branch, Division of Analysis and Epidemiology, National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, USA.

Statistics and Methodology Department, NORC at the University of Chicago, Bethesda, MD, USA.

Publication information

Stat J IAOS. 2022 Jun 7;38(2):413-421. doi: 10.3233/sji-210891.

Abstract

BACKGROUND

The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality.

METHODS

Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed.
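The precision and recall evaluation described above can be illustrated with a minimal sketch. It assumes each linkage result is a set of (survey record ID, administrative record ID) pairs and treats the established method's links as the gold standard. The function name, the IDs, and the toy data are hypothetical and are not taken from the study.

```python
def precision_recall(candidate_links, gold_links):
    """Compare candidate links (e.g. from PPRL) with gold standard links.

    Each link is a (survey_id, admin_id) pair. Precision is the share of
    candidate links that are also gold standard links; recall is the share
    of gold standard links that the candidate method recovered.
    """
    candidate = set(candidate_links)
    gold = set(gold_links)
    true_positives = len(candidate & gold)
    precision = true_positives / len(candidate) if candidate else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
gold = {("s1", "d1"), ("s2", "d2"), ("s3", "d3"), ("s4", "d4"), ("s5", "d5")}
pprl = {("s1", "d1"), ("s2", "d2"), ("s3", "d3"), ("s6", "d9")}

precision, recall = precision_recall(pprl, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four PPRL links agree with the gold standard (precision 0.75), and three of the five gold standard links are recovered (recall 0.60).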

RESULTS

The match rates were similar across approaches: 5.1% for the gold standard, 5.4% for the initial PPRL approach, and 5.0% for the refined PPRL approach. Depending on the selection of tokens from PPRL, precision ranged from 93.8% to 98.9% and recall from 97.8% to 98.7%. The impact of PPRL on secondary data analysis was minimal.
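The abstract does not describe how the PPRL tokens are constructed. As a rough illustration only, the sketch below assumes a token is a keyed hash (HMAC-SHA256) over a normalized combination of identifier fields, so that two parties sharing a secret key can compare tokens without exchanging plain text identifiers. The field names, the key, and the normalization rule are all hypothetical.

```python
import hashlib
import hmac


def normalize(value):
    """Uppercase and keep only letters and digits (assumed normalization)."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def make_token(record, fields, key):
    """Build one token from the chosen identifier fields of a record."""
    message = "|".join(normalize(record[f]) for f in fields)
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical shared key and records, for illustration only.
key = b"example-shared-secret"
survey_record = {"last_name": "O'Brien", "dob": "1950-01-02", "sex": "F"}
admin_record = {"last_name": "obrien ", "dob": "19500102", "sex": "f"}

# Two hypothetical token definitions built from different field selections.
token_a_fields = ("last_name", "dob")
token_b_fields = ("last_name", "dob", "sex")

match_a = (make_token(survey_record, token_a_fields, key)
           == make_token(admin_record, token_a_fields, key))
match_b = (make_token(survey_record, token_b_fields, key)
           == make_token(admin_record, token_b_fields, key))
print(match_a, match_b)
```

Under these assumptions, choosing which tokens must agree before two records are declared a link is the kind of selection that trades precision against recall: stricter field combinations tend to reduce false links, while looser ones tend to recover more true links.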

DISCUSSION

The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI.

CONCLUSION

The results of this study are encouraging as a first step for a statistical agency implementing PPRL approaches; however, further research is needed.
