Department of Statistical Science, Duke University, Durham, North Carolina.
Stat Med. 2018 Oct 30;37(24):3533-3546. doi: 10.1002/sim.7911. Epub 2018 Aug 1.
We develop methodology for causal inference in observational studies when using propensity score subclassification on data constructed with probabilistic record linkage techniques. We focus on scenarios where covariates and binary treatment assignments are in one file and outcomes are in another file, and the goal is to estimate an additive treatment effect by merging the files. We assume that the files can be linked using variables common to both files, e.g., names or birth dates, but that links are subject to errors, e.g., due to reporting errors in the linking variables. We develop methodology for cases where such reporting errors are independent of the other variables in the files. We describe conceptually how linkage errors can affect causal estimates in subclassification contexts. We also present and evaluate several algorithms for deciding which record pairs to use in estimating causal effects. Using simulation studies, we demonstrate that case selection procedures can yield more accurate estimates of treatment effects from linked data than using only cases known to be true links.
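To make the estimation target concrete, the following is a minimal sketch of a generic propensity score subclassification estimator for an additive treatment effect. It is not the authors' implementation and does not model linkage error. The function name `subclassification_estimate`, its parameters, and the choice of equal-sized strata are assumptions made for illustration, and the propensity scores are assumed to have been estimated beforehand. The inputs are meant to be the record pairs already selected for estimation.

```python
from statistics import mean


def subclassification_estimate(scores, treated, outcomes, n_strata=5):
    """Additive treatment effect via propensity score subclassification.

    scores   : propensity scores (assumed already estimated elsewhere)
    treated  : 0/1 treatment indicators from the covariate file
    outcomes : outcomes from the linked outcome file
    n_strata : number of equal-sized subclasses (hypothetical default of 5)
    """
    # Order record indices by propensity score, then cut into equal-sized strata.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    weighted_sum, used = 0.0, 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcomes[i] for i in idx if treated[i] == 1]
        y0 = [outcomes[i] for i in idx if treated[i] == 0]
        if not y1 or not y0:
            continue  # skip strata that lack treated or control records
        # Within-stratum difference in mean outcomes, weighted by stratum size.
        weighted_sum += len(idx) * (mean(y1) - mean(y0))
        used += len(idx)
    if used == 0:
        raise ValueError("no stratum contains both treated and control records")
    return weighted_sum / used
```

In this sketch a false link pairs a record's treatment and propensity score with some other record's outcome, which is one route by which linkage errors could distort the within-stratum differences.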