American Institutes for Research, Arlington, VA 22202, United States.
National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA 30341, United States.
J Am Med Inform Assoc. 2024 Nov 1;31(11):2605-2612. doi: 10.1093/jamia/ocae196.
To understand the landscape of privacy-preserving record linkage (PPRL) applications in public health, assess estimates of PPRL accuracy and privacy, and evaluate factors for PPRL adoption.
A literature scan examined the accuracy, data privacy, and scalability of PPRL in public health. Twelve interviews with subject matter experts were conducted and coded using an inductive approach to identify factors related to PPRL adoption.
PPRL has a high level of linkage quality and accuracy. PPRL linkage quality was comparable to that of clear text linkage methods (requiring direct personally identifiable information [PII]) across various settings and research questions. Accuracy of PPRL depended on several components, such as the PPRL technique and the proportion of missing values and errors in the underlying data. Strategies to increase adoption include increasing understanding of PPRL, improving data owner buy-in, establishing governance structure and oversight, and developing a public health implementation strategy for PPRL.
PPRL protects privacy by eliminating the need to share PII for linkage, but the accuracy and linkage quality depend on factors including the choice of PPRL technique and the specific PII used to create encrypted identifiers. Large-scale implementations of PPRL linking millions of observations, including PCORnet, the National Institutes of Health N3C, and the Centers for Disease Control and Prevention COVID-19 project, have demonstrated the scalability of PPRL for public health applications.
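As an illustration of how encrypted identifiers can stand in for PII during linkage, the following is a minimal sketch of one possible approach: exact-match tokens built with a keyed hash. It is not the method of any project named above. The choice of HMAC-SHA256, the PII fields used, and the names `normalize`, `make_token`, and `link` are all assumptions made for this example; other PPRL techniques exist and may behave differently when data contain errors.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Strip all whitespace and uppercase, so trivial formatting
    differences do not change the resulting token."""
    return "".join(value.split()).upper()


def make_token(first_name: str, last_name: str, dob: str, secret_key: bytes) -> str:
    """Build a hypothetical linkage token from three PII fields.

    Assumption: both data owners share `secret_key` and apply the same
    normalization. Only the hex digest leaves the site, never the PII.
    """
    message = "|".join(normalize(v) for v in (first_name, last_name, dob))
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link(tokens_a, tokens_b):
    """Return the tokens present in both datasets (exact-match linkage)."""
    return sorted(set(tokens_a) & set(tokens_b))


# Illustrative records only; the key and values are made up.
shared_key = b"demo-key-agreed-out-of-band"
site_a = [make_token("Ana", "Silva", "1980-02-03", shared_key)]
site_b = [
    make_token(" ana ", "SILVA", "1980-02-03", shared_key),
    make_token("Ben", "Okoye", "1975-11-30", shared_key),
]
matches = link(site_a, site_b)
```

Because a token is derived only from the chosen fields, a typo or missing value in any of them produces a different token and a missed match, which is consistent with the observation that accuracy depends on the technique and on the specific PII used.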
Applications of PPRL in public health have demonstrated their value for the public health community. Although gaps must be addressed before wide implementation, PPRL is a promising solution to data linkage challenges faced by the public health ecosystem.