Shoji Takehiro, Tsuchida Jun, Yadohisa Hiroshi
Nikkei Inc., Chiyoda-ku, Tokyo, Japan.
Department of Data Science, Kyoto Women's University, Kyoto, Japan.
Stat Methods Med Res. 2025 Jan;34(1):69-84. doi: 10.1177/09622802241299410. Epub 2024 Dec 12.
When using the propensity score method to estimate treatment effects, it is important to choose which covariates to include in the propensity score model. Including covariates unrelated to the outcome in the propensity score model leads to bias and large variance in the estimator of treatment effects. Many data-driven covariate selection methods have been proposed for selecting covariates related to the outcome. However, most of them assume estimation of the average treatment effect and may not be designed to estimate quantile treatment effects (QTEs), which are the effects of treatment on the quantiles of the outcome distribution. In QTE estimation, a covariate can be related to the outcome in two ways: through its expected value and through its quantile. To account for both, we propose a data-driven covariate selection method for propensity score models that selects covariates related to the expected value and to the quantile of the outcome for QTE estimation. Assuming a quantile regression model as the outcome regression model, covariate selection is performed using a regularization method with the partial regression coefficients of the quantile regression model as weights. The proposed method was applied to artificial data and to a dataset of mothers and children born in King County, Washington, to compare its performance with that of existing methods and QTE estimators. The results show that the proposed method performs well in the presence of covariates related to both the expected value and the quantile of the outcome.
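To make the target quantity concrete, the following is a minimal sketch of one common way a QTE can be estimated once propensity scores are available: the difference between inverse-probability-weighted quantiles of the outcome in the treated and control groups. It is not the covariate selection method proposed in the paper. The function names `weighted_quantile` and `ipw_qte`, the toy data, and the assumption that the propensity scores are already known are all illustrative.

```python
def weighted_quantile(values, weights, tau):
    """Return the smallest value whose normalized cumulative weight reaches tau."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return value
    return pairs[-1][0]


def ipw_qte(y, t, ps, tau):
    """Inverse-probability-weighted QTE at quantile level tau.

    y   : outcomes
    t   : treatment indicators (1 = treated, 0 = control)
    ps  : propensity scores P(T = 1 | X), assumed known here (hypothetical input)
    tau : quantile level in (0, 1)
    """
    # Treated units are weighted by 1 / ps, controls by 1 / (1 - ps).
    y1 = [yi for yi, ti in zip(y, t) if ti == 1]
    w1 = [1.0 / pi for ti, pi in zip(t, ps) if ti == 1]
    y0 = [yi for yi, ti in zip(y, t) if ti == 0]
    w0 = [1.0 / (1.0 - pi) for ti, pi in zip(t, ps) if ti == 0]
    return weighted_quantile(y1, w1, tau) - weighted_quantile(y0, w0, tau)


# Toy data (made up for illustration): constant propensity score of 0.5.
y = [3, 4, 5, 6, 1, 2, 3, 4]
t = [1, 1, 1, 1, 0, 0, 0, 0]
ps = [0.5] * 8
print(ipw_qte(y, t, ps, tau=0.5))
```

In practice the propensity scores would come from a fitted model, and the choice of covariates in that model is exactly what the abstract's selection method addresses.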