Institut für Statistik, Ludwig-Maximilians-Universität München, Munich, Germany.
Munich Center for Machine Learning (MCML), Germany.
Stat Methods Med Res. 2024 Mar;33(3):392-413. doi: 10.1177/09622802231224628. Epub 2024 Feb 8.
The estimation of heterogeneous treatment effects has attracted considerable interest in many disciplines, most prominently in medicine and economics. Contemporary research has so far primarily focused on continuous and binary responses, where heterogeneous treatment effects are traditionally estimated by a linear model, which allows the estimation of constant or heterogeneous effects even under certain model misspecifications. More complex models for survival, count, or ordinal outcomes require stricter assumptions to reliably estimate the treatment effect. Most importantly, the noncollapsibility issue necessitates the joint estimation of treatment and prognostic effects. Model-based forests allow simultaneous estimation of covariate-dependent treatment and prognostic effects, but only for randomized trials. In this paper, we propose modifications to model-based forests to address the confounding issue in observational data. In particular, we evaluate an orthogonalization strategy originally proposed by Robinson (1988, Econometrica) in the context of model-based forests targeting heterogeneous treatment effect estimation in generalized linear models and transformation models. We found that this strategy reduces confounding effects in a simulation study with various outcome distributions. We demonstrate the practical aspects of heterogeneous treatment effect estimation for survival and ordinal outcomes by an assessment of the potentially heterogeneous effect of Riluzole on the progression of amyotrophic lateral sclerosis.
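To illustrate the idea behind the orthogonalization strategy, the following is a minimal sketch of Robinson-style residualization in the simplest setting: a constant treatment effect, a continuous outcome, and a single confounder. It is not the model-based forest procedure proposed in the paper. The data-generating process, the cubic polynomial used for both nuisance regressions, and the function names (`simulate`, `robinson_estimate`, `naive_estimate`) are all assumptions made for this example.

```python
import numpy as np


def simulate(n=5000, tau=1.0, seed=0):
    """Simulate confounded observational data (hypothetical setup).

    The covariate x raises both the treatment probability and the outcome,
    so a naive comparison of treated and untreated units is biased.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    propensity = 1.0 / (1.0 + np.exp(-x))          # P(W = 1 | x)
    w = rng.binomial(1, propensity).astype(float)  # binary treatment
    y = tau * w + 2.0 * x + rng.normal(size=n)     # outcome, true effect tau
    return y, w, x


def _residualize(target, x, degree=3):
    """Subtract a least-squares polynomial fit of target on x."""
    basis = np.vander(x, degree + 1)  # columns: x^3, x^2, x, 1
    coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return target - basis @ coef


def robinson_estimate(y, w, x):
    """Regress outcome residuals on treatment residuals (no intercept)."""
    y_res = _residualize(y, x)  # y minus estimate of E[Y | x]
    w_res = _residualize(w, x)  # w minus estimate of E[W | x]
    return float(w_res @ y_res / (w_res @ w_res))


def naive_estimate(y, w):
    """Unadjusted difference in mean outcomes between treatment groups."""
    return float(y[w == 1].mean() - y[w == 0].mean())
```

Calling `simulate()` and comparing `robinson_estimate(y, w, x)` with `naive_estimate(y, w)` shows the point of the strategy: in this setup the residualized estimate should land near the true effect, while the unadjusted difference in means absorbs the influence of the confounder. The sketch uses simple polynomial regressions for the nuisance functions, which is a choice made here for brevity and is not taken from the paper.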