Extending the Fellegi-Sunter probabilistic record linkage method for approximate field comparators.

Author information

Department of Biomedical Informatics, University of Utah, Utah, USA.

Publication information

J Biomed Inform. 2010 Feb;43(1):24-30. doi: 10.1016/j.jbi.2009.08.004. Epub 2009 Aug 13.

Abstract

Probabilistic record linkage is a method commonly used to determine whether demographic records refer to the same person. The Fellegi-Sunter method is a probabilistic approach that uses field weights based on log likelihood ratios to determine record similarity. This paper introduces an extension of the Fellegi-Sunter method that incorporates approximate field comparators in the calculation of field weights. The data warehouse of a large academic medical center was used as a case study. The approximate comparator extension was compared with the Fellegi-Sunter method in its ability to find duplicate records previously identified in the data warehouse using different demographic fields and matching cutoffs. The approximate comparator extension misclassified 25% fewer pairs and had a larger Welch's T statistic than the Fellegi-Sunter method for all field sets and matching cutoffs. The accuracy gain provided by the approximate comparator extension grew as less information was provided and as the matching cutoff increased. Given the ubiquity of linkage in both clinical and research settings, the incremental improvement of the extension has the potential to make a considerable impact.
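As a rough illustration of the idea of field weights based on log likelihood ratios, the following minimal Python sketch computes the standard Fellegi-Sunter agreement and disagreement weights from per-field m- and u-probabilities, then scores a record pair either with exact comparison or with an approximate comparator. The abstract does not state how the extension folds similarity into the weight, so the linear interpolation used here is an assumption, not the paper's formula. The m/u values, the field names, the function names, and the use of `difflib.SequenceMatcher` as the string comparator are likewise hypothetical choices made for illustration.

```python
import difflib
import math

# Hypothetical per-field probabilities (not taken from the paper):
# m = P(field agrees | records match), u = P(field agrees | records do not match)
FIELD_PARAMS = {
    "first_name": (0.90, 0.05),
    "last_name": (0.95, 0.02),
    "birth_year": (0.98, 0.01),
}


def fs_weights(m, u):
    """Standard Fellegi-Sunter weights as base-2 log likelihood ratios."""
    agree = math.log2(m / u)
    disagree = math.log2((1 - m) / (1 - u))
    return agree, disagree


def exact_similarity(a, b):
    """Binary comparator: 1.0 on exact (case-insensitive) match, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def approx_similarity(a, b):
    """Approximate comparator in [0, 1]; SequenceMatcher is a stand-in choice."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def field_weight(similarity, m, u):
    """ASSUMPTION: interpolate linearly between the disagreement weight
    (similarity 0) and the agreement weight (similarity 1)."""
    agree, disagree = fs_weights(m, u)
    return disagree + similarity * (agree - disagree)


def score_pair(rec_a, rec_b, comparator, params=FIELD_PARAMS):
    """Sum the field weights over all configured fields."""
    total = 0.0
    for field, (m, u) in params.items():
        sim = comparator(rec_a[field], rec_b[field])
        total += field_weight(sim, m, u)
    return total


if __name__ == "__main__":
    a = {"first_name": "Jon", "last_name": "Smith", "birth_year": "1980"}
    b = {"first_name": "John", "last_name": "Smyth", "birth_year": "1980"}
    print("exact score: ", score_pair(a, b, exact_similarity))
    print("approx score:", score_pair(a, b, approx_similarity))
```

With the binary comparator, a near miss such as "Smith" versus "Smyth" receives the full disagreement weight. With the approximate comparator it receives a weight between the two extremes, so the pair's total score is compared against the matching cutoff with less information discarded.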
