Deterministic record linkage versus similarity functions: a study in health databases from Brazil.

Authors

Suzuki Kátia Mitiko Firmino, Porto Filho Carlos Humberto, Cozin Luís Fernando, Pereyra Lucas Calabrez, de Azevedo Marques Paulo Mazzoncini

Affiliation

School of Medicine of Ribeirao Preto (FMRP), University of Sao Paulo (USP), Brazil.

Publication

Stud Health Technol Inform. 2013;192:562-6.

Abstract

Record linkage is a strategy for linking patient records held in different databases. This study applied a deterministic method and similarity functions (Dice, Jaro, Jaro-Winkler and Levenshtein) to integrate heterogeneous databases from the different levels of Brazilian health care (primary, secondary and tertiary). The sensitivity of the deterministic method was 54.5% (95% CI: 50.4 to 58.5). The best result obtained when disagreement was allowed in only one variable (mother's name) was 80.6% (95% CI: 77.2 to 83.6), and the best result obtained using the Jaro-Winkler similarity function was 91.8% (95% CI: 89.4 to 93.9). The deterministic method has high specificity, but its sensitivity can be reduced by spelling and typing errors in the databases. Thus, a step-by-step approach that tolerates disagreement in at least one of the linkage variables, together with the use of similarity functions, can increase the sensitivity of the method.
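The contrast between exact matching and the four similarity functions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The field names (`name`, `mother_name`, `birth_date`), the sample records, and the 0.9 threshold are hypothetical choices made only for illustration; the abstract reports neither the full set of linkage variables nor the cut-off values used.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def dice(a, b):
    """Dice coefficient over character bigrams (counted with multiplicity)."""
    ba = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bb = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 1.0 if a == b else 0.0
    return 2 * sum((ba & bb).values()) / total


def jaro(a, b):
    """Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a, b, p=0.1, max_prefix=4):
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    j = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:max_prefix], b[:max_prefix]):
        if x != y:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def deterministic_match(r1, r2, fields):
    """Link two records only if every field agrees exactly."""
    return all(r1[f] == r2[f] for f in fields)


def similarity_match(r1, r2, fields, threshold=0.9):
    """Link two records if every field reaches the (assumed) similarity threshold."""
    return all(jaro_winkler(r1[f], r2[f]) >= threshold for f in fields)


# Hypothetical records: the same person with a typing error in the name.
FIELDS = ["name", "mother_name", "birth_date"]
rec_a = {"name": "MARTHA", "mother_name": "HELENA", "birth_date": "1980-01-01"}
rec_b = {"name": "MARHTA", "mother_name": "HELENA", "birth_date": "1980-01-01"}

exact_link = deterministic_match(rec_a, rec_b, FIELDS)
fuzzy_link = similarity_match(rec_a, rec_b, FIELDS)
```

In this toy case the transposed letters in the name make the deterministic comparison reject a true pair, while the Jaro-Winkler comparison still links it. This is the kind of missed match that lowers the sensitivity of exact matching.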

