Record linkage of claims and cancer registries data - Evaluation of a deterministic linkage approach based on indirect personal identifiers.

Author information

Department of Biometry and Data Management, Leibniz-Institute for Prevention Research and Epidemiology-BIPS, Bremen, Germany.

Cancer Registry of Bremen, Leibniz Institute for Prevention Research and Epidemiology-BIPS, Bremen, Germany.

Publication information

Pharmacoepidemiol Drug Saf. 2022 Dec;31(12):1287-1293. doi: 10.1002/pds.5545. Epub 2022 Oct 6.

Abstract

PURPOSE

In Germany, record linkage of claims and cancer registry data is costly and time-consuming, since until recently no unique personal identifier was available in both data sources. The aim of this study was to evaluate the feasibility and performance of a deterministic linkage procedure based on indirect personal identifiers included in the data sources.

METHODS

We identified users of glucose-lowering drugs with residence in four federal states in Northern and Southern Germany (Bavaria, Bremen, Hamburg, Lower Saxony) in the German Pharmacoepidemiological Research Database (GePaRD) and assessed colorectal and thyroid cancer cases. Cancer registries of the federal states selected all colorectal and thyroid cancer cases between 2004 and 2015. A deterministic linkage approach was performed based on indirect personal identifiers such as year of birth, sex, area of residence, type of cancer and an absolute difference between the dates of cancer diagnosis in both data sources of at most 90 days. Results were compared to a probabilistic linkage using "direct" personal identifiers (gold standard).
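The matching rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the `CaseRecord` fields, the `deterministic_link` function, and the handling of ambiguous candidates (only unambiguous one-to-one candidates are accepted here) are hypothetical choices that the abstract does not specify.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseRecord:
    """One cancer case as seen in either data source (hypothetical layout)."""
    record_id: str
    birth_year: int
    sex: str
    region: str          # area of residence
    cancer_type: str     # e.g. "colorectal" or "thyroid"
    diagnosis_date: date


def blocking_key(rec):
    """Indirect identifiers that must agree exactly for a link."""
    return (rec.birth_year, rec.sex, rec.region, rec.cancer_type)


def deterministic_link(claims, registry, max_days=90):
    """Return (claims_id, registry_id) pairs that satisfy the rule.

    Two records link when all blocking-key fields agree and the absolute
    difference between the two diagnosis dates is at most `max_days`.
    Assumption: a claims record with more than one registry candidate is
    left unlinked, because the abstract does not say how ties are resolved.
    """
    by_key = {}
    for reg in registry:
        by_key.setdefault(blocking_key(reg), []).append(reg)

    links = []
    for claim in claims:
        candidates = [
            reg
            for reg in by_key.get(blocking_key(claim), [])
            if abs((claim.diagnosis_date - reg.diagnosis_date).days) <= max_days
        ]
        if len(candidates) == 1:
            links.append((claim.record_id, candidates[0].record_id))
    return links


# Toy records, invented purely for illustration.
claims = [
    CaseRecord("c1", 1950, "F", "Bremen", "colorectal", date(2010, 3, 1)),
    CaseRecord("c2", 1950, "F", "Bremen", "colorectal", date(2012, 3, 1)),
]
registry = [
    CaseRecord("r1", 1950, "F", "Bremen", "colorectal", date(2010, 5, 1)),
]
print(deterministic_link(claims, registry))
```

Grouping registry records by the exact-match fields first keeps the date comparison limited to plausible candidates, which matters when both sources contain many records.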

RESULTS

The deterministic linkage procedure yielded a sensitivity of 71.8% for colorectal cancer and 66.6% for thyroid cancer. For thyroid cancer, the sensitivity improved to 71.4% when only inpatient diagnoses were used to define cancer in GePaRD. Specificity was always above 99%. When the probabilistic linkage was used to define cancer cases, the estimated risk for colorectal cancer was 10 percentage points lower than when the deterministic approach was used.
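For readers unfamiliar with how such figures are obtained, the following sketch shows one way sensitivity and specificity could be computed when the probabilistic linkage serves as the gold standard. The function name, the person-level dictionaries, and the choice of unit of analysis are assumptions made for illustration; the abstract does not describe the exact computation.

```python
def sensitivity_specificity(deterministic, gold):
    """Compare deterministic linkage results against a gold standard.

    Both arguments map a person identifier to True (linked as a cancer
    case) or False (not linked). Persons missing from `deterministic`
    are treated as not linked (an assumption of this sketch).
    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    tp = fp = tn = fn = 0
    for person_id, truth in gold.items():
        predicted = deterministic.get(person_id, False)
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and predicted:
            fp += 1
        else:
            tn += 1

    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Sensitivity here is the share of gold-standard links that the deterministic rule also finds, and specificity is the share of gold-standard non-links that it correctly leaves unlinked.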

CONCLUSIONS

The sensitivity of the deterministic linkage approach appears to be too low for it to be considered a reasonable alternative to the probabilistic linkage procedure.
