A hybrid approach to record linkage using a combination of deterministic and probabilistic methodology.

Author information

Department of Pediatrics, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.

Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.

Publication information

J Am Med Inform Assoc. 2020 Apr 1;27(4):505-513. doi: 10.1093/jamia/ocz232.

Abstract

OBJECTIVE

The disjointed healthcare system and the absence of a universal patient identifier across systems necessitate accurate record linkage (RL). We aim to describe the implementation and evaluation of a hybrid record linkage method in a statewide surveillance system for congenital heart disease.

MATERIALS AND METHODS

Clear-text personally identifiable information on individuals in the Colorado Congenital Heart Disease surveillance system was obtained from 5 electronic health record and medical claims data sources. Two deterministic methods and 1 probabilistic RL method using first name, last name, social security number, date of birth, and house number were initially implemented independently and then sequentially in a hybrid approach to assess RL performance.
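The sequential idea described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the deterministic rule (exact agreement on social security number and date of birth), the Fellegi-Sunter style scoring, the m/u probabilities in `FIELD_PROBS`, and the `THRESHOLD` cutoff are all assumptions chosen for illustration, and the field names are hypothetical.

```python
from itertools import combinations
from math import log2

# Hypothetical m/u probabilities per field (illustrative values, not from the study).
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
FIELD_PROBS = {
    "first_name": (0.90, 0.01),
    "last_name": (0.92, 0.005),
    "ssn": (0.95, 0.0001),
    "dob": (0.97, 0.003),
    "house_number": (0.80, 0.02),
}
THRESHOLD = 10.0  # assumed cutoff on the summed log2 weight


def deterministic_match(a, b):
    """Assumed rule: exact agreement on a non-missing SSN and on date of birth."""
    return bool(a["ssn"]) and a["ssn"] == b["ssn"] and a["dob"] == b["dob"]


def match_weight(a, b):
    """Fellegi-Sunter style score: sum of log2 agreement/disagreement weights."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if not a[field] or not b[field]:
            continue  # treat missing values as uninformative
        if a[field] == b[field]:
            total += log2(m / u)
        else:
            total += log2((1 - m) / (1 - u))
    return total


def hybrid_link(records):
    """Try the deterministic rule first; score only the pairs it did not link."""
    links = {}
    for i, j in combinations(range(len(records)), 2):
        a, b = records[i], records[j]
        if deterministic_match(a, b):
            links[(i, j)] = "deterministic"
        elif match_weight(a, b) >= THRESHOLD:
            links[(i, j)] = "probabilistic"
    return links


def make_record(first_name, last_name, ssn, dob, house_number):
    """Helper for building toy records (hypothetical schema)."""
    return {"first_name": first_name, "last_name": last_name, "ssn": ssn,
            "dob": dob, "house_number": house_number}


records = [
    make_record("Ana", "Lopez", "123456789", "2001-05-17", "42"),
    make_record("Anna", "Lopez", "123456789", "2001-05-17", "42"),
    make_record("Ben", "Ortiz", "", "1998-11-02", "7"),
    make_record("Ben", "Ortiz", "", "1998-11-02", "7"),
    make_record("Cara", "Smith", "987654321", "2010-01-01", "9"),
]
links = hybrid_link(records)
```

In this toy data the first two records are linked by the deterministic rule despite the first-name variant, while the third and fourth records lack a social security number and are linked only by the probabilistic score. That is the kind of pair a purely deterministic pass would miss.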

RESULTS

A total of 16 480 nonunique individuals with congenital heart disease were ascertained. Deterministic linkage methods, when performed independently, yielded 4505 linked pairs (each consisting of 2 records linked together within or across data sources). Probabilistic RL, using the first 3 characters of last name and gender for blocking, yielded 6294 linked pairs when executed independently. The hybrid linkage routine resulted in 6451 linkages and an additional 18%-24% correct linked pairs compared with the independent methods. It also achieved higher recall and F-measure scores than the probabilistic and deterministic methods performed independently.
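As a rough illustration of the blocking and scoring terms used in these results, the sketch below builds candidate pairs from a blocking key (first 3 characters of last name plus gender) and computes precision, recall, and F-measure from sets of linked pairs. The function names, the record fields, and the uppercase normalization are assumptions, and the F-measure is taken here to be the balanced F1 score, which the abstract does not specify.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    """Block on the first 3 characters of last name plus gender.

    Uppercasing the name prefix is an assumed normalization step.
    """
    return (record["last_name"][:3].upper(), record["gender"])


def candidate_pairs(records):
    """Return the set of index pairs whose records share a blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs = set()
    for members in blocks.values():
        pairs.update(combinations(members, 2))
    return pairs


def precision_recall_f(predicted, truth):
    """Compare predicted linked pairs against a reference set of true pairs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)  # balanced F1
    return precision, recall, f_measure


people = [
    {"last_name": "Lopez", "gender": "F"},
    {"last_name": "lopes", "gender": "F"},
    {"last_name": "Lopez", "gender": "M"},
    {"last_name": "Smith", "gender": "F"},
]
pairs = candidate_pairs(people)

predicted = {(0, 1), (2, 3), (4, 5)}
truth = {(0, 1), (2, 3), (6, 7), (8, 9)}
precision, recall, f_measure = precision_recall_f(predicted, truth)
```

Only records in the same block are compared, which reduces the number of pairwise comparisons at the cost of missing true matches whose blocking fields disagree.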

DISCUSSION

The hybrid approach resulted in increased linkage accuracy and identified pairs of linked records that would otherwise have been missed by any single linkage technique used independently.

CONCLUSION

When performing RL within and across disparate data sources, the hybrid RL routine outperformed independent deterministic and probabilistic methods.
