Cadei Riccardo, Khoshnevis Naeem, Lee Kwonsang, Garcia Daniela Maria, Bargagli Stoffi Falco J
Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America.
Department of Computer and Communication Science, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.
J Open Source Softw. 2023;8(92). doi: 10.21105/joss.05587. Epub 2023 Dec 15.
In the health and social sciences, it is critically important to identify interpretable subgroups of the study population in which the causal effect of a treatment differs notably from the average treatment effect (ATE). Several approaches have been proposed for heterogeneous treatment effect (HTE) discovery. Some first estimate the conditional average treatment effect (CATE) and identify heterogeneous subgroups in a second stage (Bargagli-Stoffi et al., 2020, 2022; Foster et al., 2011; Hahn et al., 2020); others estimate these subgroups directly in a data-driven procedure (Nagpal et al., 2020; Wang & Rudin, 2022). Many of these methodologies are based on decision trees. Tree-based approaches rely on efficient and easily implementable recursive mathematical programming (e.g., HTE maximization), can be easily adapted to different scenarios depending on the research question of interest, and guarantee a high degree of interpretability, i.e., the degree to which a human can understand the cause of a decision (Lakkaraju et al., 2016). Despite these appealing features, single-tree heterogeneity discovery has two main limitations: instability in the identification of the subgroups and limited exploration of the potential heterogeneity. To address these shortcomings, Bargagli-Stoffi et al. (2023) proposed Causal Rule Ensemble, a new method for interpretable HTE characterization in terms of decision rules, which explores heterogeneity patterns extensively through an ensemble-of-trees approach. CRE is an R package providing a flexible implementation of Causal Rule Ensemble. The package supports multiple variants of the Causal Rule Ensemble algorithm, including different internal individual average treatment effect (IATE) estimators: AIPW (Robins et al., 1994), Causal Forest (Athey et al., 2019), Causal BART (Hill, 2011), S-Learner (Hill, 2011), T-Learner (Hansotia & Rukstales, 2002), and X-Learner (Künzel et al., 2019).
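To make the notion of an IATE estimator concrete, the following is a minimal Python sketch of the T-Learner idea: fit one outcome model per treatment arm and take the difference of their predictions for every individual. It is an illustration only and not the CRE package's implementation (CRE is an R package). The function names `fit_linear`, `predict_linear`, and `t_learner_iate`, the choice of ordinary least squares as the outcome model, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def fit_linear(X, y):
    """Fit ordinary least squares with an intercept; return the coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict outcomes from coefficients returned by fit_linear."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


def t_learner_iate(X, z, y):
    """T-Learner sketch: one outcome model per arm, IATE = difference of predictions.

    X: (n, p) covariates, z: (n,) binary treatment indicator, y: (n,) outcome.
    """
    coef_treated = fit_linear(X[z == 1], y[z == 1])
    coef_control = fit_linear(X[z == 0], y[z == 0])
    return predict_linear(coef_treated, X) - predict_linear(coef_control, X)


# Synthetic data (assumed for illustration): the true effect varies with the
# first covariate, so the treatment effect is heterogeneous by construction.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
z = rng.integers(0, 2, size=n)          # randomized binary treatment
tau = 1.0 + 2.0 * X[:, 0]               # true individual treatment effect
y = X[:, 1] + z * tau + rng.normal(scale=0.1, size=n)

tau_hat = t_learner_iate(X, z, y)
mean_abs_error = float(np.mean(np.abs(tau_hat - tau)))
```

Because the simulated effect is linear in the covariates and treatment is randomized, the linear T-Learner recovers it closely here. With real observational data, a more flexible outcome model and adjustment for confounding would generally be needed, which is where estimators such as AIPW or Causal Forest come in.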