Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA.
Eli Lilly and Company, Indianapolis, Indiana, USA.
Stat Med. 2022 Apr 15;41(8):1421-1445. doi: 10.1002/sim.9289. Epub 2021 Dec 26.
Unlike in randomized clinical trials (RCTs), confounding control is critical for estimating causal effects from observational studies because treatment is not randomized. Under the unconfoundedness assumption, matching methods are popular because they can be used to emulate an RCT that is hidden in the observational study. To ensure that this key assumption holds, effort is often made to collect a large number of possible confounders, rendering dimension reduction imperative in matching. Three matching schemes based on the propensity score (PSM), prognostic score (PGM), and double score (DSM, i.e., the collection of the first two scores) have been proposed in the literature. However, a comprehensive comparison of the three matching schemes is lacking, and existing comparisons have not addressed best practices such as variable selection, choice of caliper, and matching with or without replacement. In this article, we explore the statistical and numerical properties of PSM, PGM, and DSM via extensive simulations. Our study supports that DSM performs comparably to, if not better than, the two single-score matching schemes in terms of bias and variance. In particular, DSM is doubly robust in the sense that the matching estimator is consistent provided that either the propensity score model or the prognostic score model is correctly specified. We suggest variable selection on the propensity score model and matching with replacement for DSM, and we illustrate the recommendations with comprehensive simulation studies. An R package is available at https://github.com/Yunshu7/dsmatch.
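To make the idea of matching on a double score concrete, the following is a minimal Python sketch and not the interface of the dsmatch R package. It makes several assumptions that do not come from the abstract: the helper names `simulate`, `fit_logistic`, and `double_score_match_ate` are hypothetical, the propensity score is fit by a plain logistic regression, the prognostic scores are fit by arm-specific linear regressions, and matching is 1:1 nearest-neighbour with replacement on the standardized scores without a caliper.

```python
import numpy as np


def simulate(n=1000, seed=0):
    """Toy observational data (an assumption for illustration); true ATE is 2."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))
    # Treatment depends on x1 and x2, which also drive the outcome (confounding).
    p = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))
    a = rng.binomial(1, p)
    y = 1.0 + X[:, 0] + X[:, 1] + X[:, 2] + 2.0 * a + rng.normal(size=n)
    return X, a, y


def fit_logistic(Z, a, n_iter=25, ridge=1e-8):
    """Logistic regression coefficients via Newton's method (Z includes an intercept)."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (a - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def double_score_match_ate(X, a, y):
    """ATE estimate from 1:1 matching with replacement on the double score."""
    Z = np.column_stack([np.ones(len(X)), X])
    # Propensity score: estimated probability of treatment given covariates.
    ps = 1.0 / (1.0 + np.exp(-Z @ fit_logistic(Z, a)))
    # Prognostic scores: predicted outcome under control and under treatment,
    # from linear models fit within each arm and evaluated for every unit.
    b0 = np.linalg.lstsq(Z[a == 0], y[a == 0], rcond=None)[0]
    b1 = np.linalg.lstsq(Z[a == 1], y[a == 1], rcond=None)[0]
    scores = np.column_stack([ps, Z @ b0, Z @ b1])
    scores = (scores - scores.mean(axis=0)) / scores.std(axis=0)

    idx1, idx0 = np.flatnonzero(a == 1), np.flatnonzero(a == 0)
    # Squared Euclidean distance from every unit to every unit in each arm.
    d1 = ((scores[:, None, :] - scores[idx1][None, :, :]) ** 2).sum(axis=-1)
    d0 = ((scores[:, None, :] - scores[idx0][None, :, :]) ** 2).sum(axis=-1)
    match1, match0 = idx1[d1.argmin(axis=1)], idx0[d0.argmin(axis=1)]
    # Keep the observed outcome; impute the missing one from the nearest match.
    y1 = np.where(a == 1, y, y[match1])
    y0 = np.where(a == 0, y, y[match0])
    return float(np.mean(y1 - y0))
```

Calling `double_score_match_ate(*simulate())` returns the matched estimate for the toy data, which can be compared with the unadjusted difference in arm means. Matching with replacement is used here because that is what the abstract suggests for DSM; variable selection on the propensity score model is omitted to keep the sketch short.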