1 Division of Biostatistics, University of California, USA.
2 Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, USA.
Stat Methods Med Res. 2019 Apr;28(4):1044-1063. doi: 10.1177/0962280217744588. Epub 2017 Dec 11.
Propensity score-based estimators are increasingly used for causal inference in observational studies. However, model selection for propensity score estimation in high-dimensional data has received little attention. In these settings, propensity score models have traditionally been selected based on the goodness-of-fit for the treatment mechanism itself, without consideration of the causal parameter of interest. Collaborative minimum loss-based estimation is a novel methodology for causal inference that takes into account information on the causal parameter of interest when selecting a propensity score model. This "collaborative learning" considers variable associations with both treatment and outcome when selecting a propensity score model, in order to optimize the bias-variance tradeoff in the estimated treatment effect. In this study, we introduce a novel approach for collaborative model selection when using the LASSO estimator for propensity score estimation in high-dimensional covariate settings. To demonstrate the importance of selecting the propensity score model collaboratively, we designed quasi-experiments based on a real electronic healthcare database, where only the potential outcomes were manually generated, and the treatment and baseline covariates remained unchanged. Results showed that the collaborative minimum loss-based estimation algorithm outperformed other competing estimators for both point estimation and confidence interval coverage. In addition, the propensity score model selected by collaborative minimum loss-based estimation could be applied to other propensity score-based estimators, which also resulted in substantive improvement for both point estimation and confidence interval coverage. We illustrate the discussed concepts through an empirical example comparing the effects of non-selective nonsteroidal anti-inflammatory drugs with selective COX-2 inhibitors on gastrointestinal complications in a population of Medicare beneficiaries.
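The contrast between selecting the LASSO penalty by treatment-model fit and selecting it "collaboratively" can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the algorithm of the paper: the simulated data, the function names (`fit_lasso_logistic`, `cv_losses`, `tmle_ate`, `run_demo`), the penalty grid, the deliberately crude initial outcome fit, and the linear fluctuation step for a continuous outcome are all hypothetical choices made for this example. It uses NumPy only, with a simple proximal-gradient LASSO solver standing in for a production implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_lasso_logistic(X, a, lam, n_iter=300):
    """L1-penalized logistic regression by proximal gradient (ISTA).
    The intercept is unpenalized; a fixed iteration count is used
    and convergence is not checked."""
    n, p = X.shape
    b0, beta = 0.0, np.zeros(p)
    design = np.column_stack([np.ones(n), X])
    step = 4.0 * n / np.linalg.norm(design, 2) ** 2  # 1 / Lipschitz bound
    for _ in range(n_iter):
        resid = sigmoid(b0 + X @ beta) - a
        b0 -= step * resid.mean()
        beta = beta - step * (X.T @ resid) / n
        beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)
    return b0, beta

def propensity(X, b0, beta, bound=0.025):
    # Truncate estimated propensity scores away from 0 and 1.
    return np.clip(sigmoid(b0 + X @ beta), bound, 1.0 - bound)

def clever_covariate(a, g):
    return a / g - (1.0 - a) / (1.0 - g)

def fit_initial_q(a, y):
    # Assumption: a crude initial outcome model, Y ~ 1 + A, ignoring covariates.
    return np.linalg.lstsq(np.column_stack([np.ones(len(a)), a]), y, rcond=None)[0]

def fluctuation(a, y, q0, g):
    # Least-squares coefficient of the residual on the clever covariate.
    h = clever_covariate(a, g)
    return np.sum(h * (y - q0)) / np.sum(h ** 2)

def cv_losses(W, A, Y, lambdas, n_folds=5, seed=0):
    """Cross-validated loss per penalty: log-loss of the treatment model
    and squared-error loss of the targeted outcome fit."""
    folds = np.random.default_rng(seed).permutation(len(A)) % n_folds
    ps_loss, q_loss = np.zeros(len(lambdas)), np.zeros(len(lambdas))
    for k in range(n_folds):
        tr, va = folds != k, folds == k
        c = fit_initial_q(A[tr], Y[tr])
        for j, lam in enumerate(lambdas):
            b0, beta = fit_lasso_logistic(W[tr], A[tr], lam)
            g_tr, g_va = propensity(W[tr], b0, beta), propensity(W[va], b0, beta)
            eps = fluctuation(A[tr], Y[tr], c[0] + c[1] * A[tr], g_tr)
            q_star = c[0] + c[1] * A[va] + eps * clever_covariate(A[va], g_va)
            ps_loss[j] -= np.mean(A[va] * np.log(g_va) + (1 - A[va]) * np.log(1 - g_va))
            q_loss[j] += np.mean((Y[va] - q_star) ** 2)
    return ps_loss / n_folds, q_loss / n_folds

def tmle_ate(W, A, Y, lam):
    """Targeted estimate of the average treatment effect for one penalty."""
    c = fit_initial_q(A, Y)
    b0, beta = fit_lasso_logistic(W, A, lam)
    g = propensity(W, b0, beta)
    eps = fluctuation(A, Y, c[0] + c[1] * A, g)
    q1 = c[0] + c[1] + eps / g        # targeted prediction under A = 1
    q0 = c[0] - eps / (1.0 - g)       # targeted prediction under A = 0
    return float(np.mean(q1 - q0))

def run_demo(seed=1, n=1000, p=10):
    # Simulated data (hypothetical): W0, W1 confound; W2 affects treatment only.
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n, p))
    logit = 0.8 * W[:, 0] + 0.8 * W[:, 1] + 1.2 * W[:, 2]
    A = rng.binomial(1, sigmoid(logit)).astype(float)
    Y = 1.0 * A + W[:, 0] + W[:, 1] + 0.5 * W[:, 3] + rng.normal(size=n)
    lambdas = [float(l) for l in np.logspace(-3, -0.5, 8)]
    ps_loss, q_loss = cv_losses(W, A, Y, lambdas)
    lam_ps = lambdas[int(np.argmin(ps_loss))]      # selected by treatment fit
    lam_collab = lambdas[int(np.argmin(q_loss))]   # selected by targeted outcome fit
    return {"lambdas": lambdas, "lam_ps": lam_ps, "lam_collab": lam_collab,
            "ate_ps": tmle_ate(W, A, Y, lam_ps),
            "ate_collab": tmle_ate(W, A, Y, lam_collab)}

result = run_demo()
print(result)
```

In this sketch, `lam_ps` is the penalty chosen purely by how well the propensity score model predicts treatment, while `lam_collab` is chosen by the cross-validated loss of the outcome fit after the targeting step, so that the choice depends on the outcome as well as the treatment. Whether the two penalties differ in a given run depends on the simulated data; the simulated treatment effect is 1 by construction.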