Probabilistic linkage of large public health data files.

Author information

Jaro M A

Affiliation

Match Ware Technologies, Inc., Silver Spring, MD 20905, USA.

Publication information

Stat Med. 1995;14(5-7):491-8. doi: 10.1002/sim.4780140510.

Abstract

Probabilistic linkage technology makes it feasible and efficient to link large public health databases in a statistically justifiable manner. The problem addressed by the methodology is that of matching two files of individual data under conditions of uncertainty. Each field is subject to error which is measured by the probability that the field agrees given a record pair matches (called the m probability) and probabilities of chance agreement of its value states (called the u probability). Fellegi and Sunter pioneered record linkage theory. Advances in methodology include use of an EM algorithm for parameter estimation, optimization of matches by means of a linear sum assignment program, and more recently, a probability model that addresses both m and u probabilities for all value states of a field. This provides a means for obtaining greater precision from non-uniformly distributed fields, without the theoretical complications arising from frequency-based matching alone. The model includes an iterative parameter estimation procedure that is more robust than pre-match estimation techniques. The methodology was originally developed and tested by the author at the U.S. Census Bureau for census undercount estimation. The more recent advances and a new generalized software system were tested and validated by linking highway crashes to Emergency Medical Service (EMS) reports and to hospital admission records for the National Highway Traffic Safety Administration (NHTSA).
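
As a rough illustration of how m and u probabilities turn into a match score, the sketch below computes Fellegi-Sunter style log-likelihood-ratio weights for a record pair. It is a minimal sketch, not the software described in the abstract: the field names, the m and u values, and the helper names `field_weight` and `composite_weight` are hypothetical, and the fields are assumed to be conditionally independent so that their weights can be summed.

```python
import math

def field_weight(m: float, u: float, agrees: bool) -> float:
    """Log2 likelihood-ratio weight for a single field comparison.

    m: probability the field agrees given the pair is a true match.
    u: probability the field agrees by chance given a non-match.
    """
    if agrees:
        return math.log2(m / u)
    return math.log2((1.0 - m) / (1.0 - u))

def composite_weight(params: dict, agreement: dict) -> float:
    """Sum the field weights (assumes conditional independence of fields)."""
    return sum(
        field_weight(m, u, agreement[name]) for name, (m, u) in params.items()
    )

# Hypothetical (m, u) pairs, chosen only for illustration.
PARAMS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "sex": (0.98, 0.50),
}

# A pair agreeing on every field scores well above zero.
all_agree = composite_weight(
    PARAMS, {"surname": True, "birth_year": True, "sex": True}
)

# A pair agreeing only on sex scores below zero: agreement on a field
# with a high u probability carries little evidence of a match.
sex_only = composite_weight(
    PARAMS, {"surname": False, "birth_year": False, "sex": True}
)

print(f"all fields agree: {all_agree:.2f}")
print(f"only sex agrees:  {sex_only:.2f}")
```

In this simplified form each field has a single m and a single u. The value-specific model mentioned in the abstract would instead carry separate probabilities for each value state of a field, which this sketch does not attempt.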
