Microsimulation of an educational attainment register to predict future record linkage quality.

Affiliation

Research Methodology Group, University of Duisburg-Essen, 47057 Duisburg, Germany.

Publication information

Int J Popul Data Sci. 2023 Apr 3;8(1):2122. doi: 10.23889/ijpds.v8i1.2122. eCollection 2023.

DOI:10.23889/ijpds.v8i1.2122
PMID:37649490
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10463005/
Abstract

INTRODUCTION

Population-wide educational attainment registers are necessary for educational planning and research. Building and updating such a register requires regular linking of databases. Where unique national identification numbers are unavailable, record linkage must be based on quasi-identifiers such as name, date of birth and sex. However, the data protection principle of data minimization aims to minimize the set of identifiers held in databases.

OBJECTIVES

The German Federal Ministry of Education and Research therefore commissioned a study to inform legislation on the minimum set of identifiers required for a national educational register.

METHODS

To justify our recommendations empirically, we implemented a microsimulation of about 20 million people. The simulated register accumulates changes and errors in identifiers due to migration, regional mobility, marriage, school career and mortality, which makes it possible to study errors in longitudinal datasets. Updated records were linked yearly to the simulated register using several linkage methods. Both clear-text methods and privacy-preserving record linkage (PPRL) methods were compared.
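The yearly cycle described above (identifiers drift, then updated records are linked back to the register) can be illustrated with a toy sketch. This is not the authors' simulation: the population size, the event probabilities, the synthetic surnames and the exact-match rule are all assumptions chosen for brevity, and the function names `matchkey` and `simulate` are hypothetical.

```python
import random

KEY_FIELDS = ("surname", "dob", "sex")  # assumed "primary" identifiers


def matchkey(record, fields=KEY_FIELDS):
    """Build an exact-match key from the chosen identifier fields."""
    return tuple(record[f] for f in fields)


def simulate(n_people=1000, years=10, p_marry=0.03, p_move=0.05,
             fields=KEY_FIELDS, seed=0):
    """Return the share of correctly linked people for each simulated year.

    All probabilities and value ranges are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Current true identifiers of every person.
    truth = {
        pid: {
            "surname": f"S{rng.randrange(200)}",
            "dob": (rng.randrange(1950, 2010), rng.randrange(1, 13),
                    rng.randrange(1, 29)),
            "sex": rng.choice("MF"),
            "region": rng.randrange(16),
        }
        for pid in range(n_people)
    }
    # The register starts as a perfect copy and is refreshed only on a link.
    register = {pid: dict(rec) for pid, rec in truth.items()}
    rates = []
    for _ in range(years):
        # Life events change identifiers (name change on marriage, moves).
        for rec in truth.values():
            if rng.random() < p_marry:
                rec["surname"] = f"S{rng.randrange(200)}"
            if rng.random() < p_move:
                rec["region"] = rng.randrange(16)
        # Index the register as it stood before this year's update.
        index = {}
        for pid, rec in register.items():
            index.setdefault(matchkey(rec, fields), []).append(pid)
        correct = 0
        for pid, rec in truth.items():
            # A link counts only if the key hits exactly one register
            # entry and that entry is the right person.
            if index.get(matchkey(rec, fields), []) == [pid]:
                correct += 1
                register[pid] = dict(rec)  # refresh the stored identifiers
        rates.append(correct / n_people)
    return rates


if __name__ == "__main__":
    for year, rate in enumerate(simulate(), start=1):
        print(year, round(rate, 3))
```

Because a person whose surname changed no longer matches the stale register entry, unlinked records tend to accumulate over the years. Passing a wider `fields` tuple (for example adding `"region"`) shows how an identifier that itself changes can hurt rather than help exact matching.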

RESULTS

The results indicate linkage bias if only the primary identifiers are available in the register. More detailed identifiers, including place of birth, are required to minimize linkage bias. The amount of information available to identify a person for matching is more critical for linkage quality than the record linkage method applied. Differences in linkage quality between the best procedures (probabilistic linkage and multiple matchkeys) are minor.
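To make the role of an additional identifier concrete, the following is a minimal sketch of deterministic linkage with multiple matchkeys applied in sequence. The records, the field names and the specific matchkeys are invented for illustration and are not taken from the study; `link_with_matchkeys` is a hypothetical helper.

```python
def link_with_matchkeys(register, updates, matchkeys):
    """Link update records to register records using matchkeys in order.

    Each matchkey is a tuple of field names. An update is linked when a
    matchkey finds exactly one not-yet-used register record.
    Returns {update_id: register_id}.
    """
    links = {}
    used = set()
    for fields in matchkeys:
        # Index only the register records that are still unlinked.
        index = {}
        for rid, rec in register.items():
            if rid not in used:
                key = tuple(rec[f] for f in fields)
                index.setdefault(key, []).append(rid)
        for uid, rec in updates.items():
            if uid in links:
                continue
            hits = index.get(tuple(rec[f] for f in fields), [])
            if len(hits) == 1 and hits[0] not in used:
                links[uid] = hits[0]
                used.add(hits[0])
    return links


# Invented example: u1 is the person stored as r1, after a surname change.
register = {
    "r1": {"surname": "Meyer", "dob": "1990-04-12", "sex": "F", "birthplace": "Essen"},
    "r2": {"surname": "Schulz", "dob": "1990-04-12", "sex": "F", "birthplace": "Bonn"},
    "r3": {"surname": "Kurz", "dob": "1985-01-30", "sex": "M", "birthplace": "Kiel"},
    "r4": {"surname": "Braun", "dob": "1990-04-12", "sex": "F", "birthplace": "Hamm"},
}
updates = {
    "u1": {"surname": "Schmidt", "dob": "1990-04-12", "sex": "F", "birthplace": "Essen"},
    "u2": {"surname": "Schulz", "dob": "1990-04-12", "sex": "F", "birthplace": "Bonn"},
    "u3": {"surname": "Kurz", "dob": "1985-01-30", "sex": "M", "birthplace": "Kiel"},
}

# Primary identifiers only: the fallback key (dob, sex) is ambiguous.
primary_only = link_with_matchkeys(
    register, updates, [("surname", "dob", "sex"), ("dob", "sex")])

# With place of birth, a fallback key can still single out r1.
with_birthplace = link_with_matchkeys(
    register, updates, [("surname", "dob", "sex"), ("dob", "sex", "birthplace")])
```

In this toy data, `primary_only` leaves `u1` unlinked because two unlinked register entries share the same date of birth and sex, while `with_birthplace` links `u1` to `r1`. People whose names change would be missed systematically under the first setup, which is one way linkage bias can arise.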

CONCLUSIONS

Microsimulation is a valuable tool for designing record linkage procedures. By modelling the processes that cause changes or errors in quasi-identifiers, it seems possible to predict the data quality to expect once a register is implemented.
