Kamat Gauri, Shan Mingyang, Gutman Roee
Department of Biostatistics, Brown University, Providence, Rhode Island, USA.
Eli Lilly and Company, Indianapolis, Indiana, USA.
Stat Med. 2023 Nov 30;42(27):4931-4951. doi: 10.1002/sim.9894. Epub 2023 Aug 31.
In many healthcare and social science applications, information about units is dispersed across multiple data files. Linking records across files is necessary to estimate the associations of interest. Common record linkage algorithms rely only on similarities between linking variables that appear in all the files. Moreover, analyses of linked files often ignore errors that may arise from incorrect or missed links. Bayesian record linkage methods allow for natural propagation of linkage error by jointly sampling the linkage structure and the model parameters. We extend an existing Bayesian record linkage method to integrate associations between variables exclusive to each file being linked. We show analytically, and using simulations, that the proposed method can improve the linking process and can result in accurate inferences. We apply the method to link Meals on Wheels recipients to Medicare enrollment records.
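The central idea, that associations between variables exclusive to each file can inform which records are linked, can be illustrated with a minimal sketch. This is not the authors' model. It assumes a Fellegi-Sunter style comparison of the shared linking variables, a simple normal linear regression relating a variable x found only in one file to a variable y found only in the other, and a marginal normal distribution for y among non-links. All function names, parameter names, and numeric values below are hypothetical and chosen only for illustration.

```python
import math


def normal_logpdf(value, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    z = (value - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def comparison_log_ratio(agreements, m_probs, u_probs):
    """Log likelihood ratio (link vs. non-link) from the shared linking variables.

    m_probs[k]: assumed probability that field k agrees for a true link.
    u_probs[k]: assumed probability that field k agrees for a non-link.
    """
    total = 0.0
    for agrees, m, u in zip(agreements, m_probs, u_probs):
        if agrees:
            total += math.log(m / u)
        else:
            total += math.log((1.0 - m) / (1.0 - u))
    return total


def association_log_ratio(x, y, beta0, beta1, resid_sd, y_mean, y_sd):
    """Log likelihood ratio contributed by the file-exclusive variables.

    For a link, y is assumed to follow a regression on x; for a non-link,
    y is assumed to follow its marginal distribution, unrelated to x.
    """
    linked = normal_logpdf(y, beta0 + beta1 * x, resid_sd)
    unlinked = normal_logpdf(y, y_mean, y_sd)
    return linked - unlinked


def pair_log_weight(agreements, m_probs, u_probs, x, y, **model):
    """Combined evidence that a candidate record pair is a true link."""
    return (comparison_log_ratio(agreements, m_probs, u_probs)
            + association_log_ratio(x, y, **model))


# Hypothetical parameter values, for illustration only.
m_probs = [0.95, 0.90]   # e.g. agreement on birth year and on ZIP code
u_probs = [0.05, 0.10]
model = dict(beta0=0.0, beta1=1.0, resid_sd=0.5, y_mean=0.0, y_sd=2.0)

# Two candidates agree with the same record on every linking variable,
# so the linking variables alone cannot separate them.
x = 2.0
weight_a = pair_log_weight([True, True], m_probs, u_probs, x, y=2.1, **model)
weight_b = pair_log_weight([True, True], m_probs, u_probs, x, y=-1.0, **model)

print(weight_a > weight_b)
```

In this sketch the two candidates are tied on the linking variables, and the candidate whose y value is closer to the regression prediction from x receives the larger weight. In a fully Bayesian treatment of the kind the abstract describes, the regression parameters would be sampled jointly with the linkage structure rather than fixed in advance as they are here.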