Dong Jing, Zhang Junni L, Zeng Shuxi, Li Fan
Industrial and Commercial Bank of China, Beijing, China.
National School of Development, Center for Statistical Science and Center for Data Science, Peking University, Beijing, China.
Stat Methods Med Res. 2020 Mar;29(3):659-676. doi: 10.1177/0962280219870836. Epub 2019 Aug 28.
This paper concerns the estimation of subgroup treatment effects with observational data. Existing propensity score methods are mostly developed to estimate the overall treatment effect. Although the true propensity scores balance covariates in any subpopulation, the estimated propensity scores may result in severe imbalance in subgroup samples. Indeed, subgroup analysis amplifies a bias-variance tradeoff: increasing the complexity of the propensity score model may help to achieve covariate balance within subgroups, but it also increases variance. We propose a new method, the subgroup balancing propensity score, that ensures good subgroup balance while controlling variance inflation. For each subgroup, the subgroup balancing propensity score uses either the overall sample or the subgroup sample to estimate the propensity scores for the units within that subgroup, choosing between the two so as to optimize a criterion that accounts for a set of covariate-balancing moment conditions for both the overall sample and the subgroup samples. We develop two versions of the subgroup balancing propensity score, corresponding to matching and weighting, respectively. We devise a stochastic search algorithm to estimate the subgroup balancing propensity score when the number of subgroups is large. We demonstrate through simulations that the subgroup balancing propensity score improves the performance of propensity score methods in estimating subgroup treatment effects. We apply the subgroup balancing propensity score method to the Italy Survey of Household Income and Wealth (SHIW) to estimate the causal effects of having a debit card on household consumption for different income groups.
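The selection rule described in the abstract can be illustrated with a minimal sketch of the weighting version. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`fit_logistic`, `predict`, `imbalance`, `sbps_weighting`), the plain gradient-ascent logistic fit, the clipping of scores to [0.01, 0.99], and the criterion, which is a simplified stand-in (squared inverse-probability-weighted mean differences, summed over the overall sample and each subgroup) for the paper's moment conditions. The sketch also enumerates all combinations exhaustively, which is only feasible for a handful of subgroups; the stochastic search mentioned in the abstract is not reproduced here.

```python
import itertools
import math


def predict(beta, x):
    """Estimated propensity score for covariates x, clipped away from 0 and 1."""
    eta = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
    eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
    p = 1.0 / (1.0 + math.exp(-eta))
    return min(max(p, 0.01), 0.99)  # clipping bounds are an assumption


def fit_logistic(X, z, steps=300, lr=0.5):
    """Logistic regression with intercept, fitted by plain gradient ascent."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, zi in zip(X, z):
            resid = zi - predict(beta, x)
            grad[0] += resid
            for j, xj in enumerate(x):
                grad[j + 1] += resid * xj
        beta = [b + lr * gj / len(X) for b, gj in zip(beta, grad)]
    return beta


def imbalance(X, z, ps, idx):
    """Sum over covariates of squared IPW-weighted treated-minus-control means.

    Assumes the units in idx include at least one treated and one control unit.
    """
    wt = sum(1.0 / ps[i] for i in idx if z[i] == 1)
    wc = sum(1.0 / (1.0 - ps[i]) for i in idx if z[i] == 0)
    total = 0.0
    for j in range(len(X[0])):
        mt = sum(X[i][j] / ps[i] for i in idx if z[i] == 1) / wt
        mc = sum(X[i][j] / (1.0 - ps[i]) for i in idx if z[i] == 0) / wc
        total += (mt - mc) ** 2
    return total


def sbps_weighting(X, z, g):
    """Choose, per subgroup, between overall-sample and subgroup-sample scores.

    Returns (criterion value, {subgroup: "overall" | "subgroup"}, scores).
    """
    n = len(X)
    groups = sorted(set(g))
    members = {r: [i for i in range(n) if g[i] == r] for r in groups}
    everyone = list(range(n))

    # Candidate 1: scores from one model fitted on the overall sample.
    beta_all = fit_logistic(X, z)
    ps_overall = [predict(beta_all, x) for x in X]

    # Candidate 2: scores from a separate model fitted within each subgroup.
    ps_sub = list(ps_overall)
    for r in groups:
        beta_r = fit_logistic([X[i] for i in members[r]],
                              [z[i] for i in members[r]])
        for i in members[r]:
            ps_sub[i] = predict(beta_r, X[i])

    best = None
    # Exhaustive search over all 2^R assignments (small R only).
    for combo in itertools.product(("overall", "subgroup"), repeat=len(groups)):
        choice = dict(zip(groups, combo))
        ps = [ps_sub[i] if choice[g[i]] == "subgroup" else ps_overall[i]
              for i in range(n)]
        value = imbalance(X, z, ps, everyone) + sum(
            imbalance(X, z, ps, members[r]) for r in groups)
        if best is None or value < best[0]:
            best = (value, choice, ps)
    return best
```

Calling `sbps_weighting(X, z, g)` with a list of covariate vectors `X`, binary treatment indicators `z`, and subgroup labels `g` returns the selected source for each subgroup together with the combined scores. Because the all-overall and all-subgroup assignments are both among the candidates, the selected combination is never worse on this criterion than either of those two defaults.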