Inoue Kosuke, Adomi Motohiko, Efthimiou Orestis, Komura Toshiaki, Omae Kenji, Onishi Akira, Tsutsumi Yusuke, Fujii Tomoko, Kondo Naoki, Furukawa Toshi A
Department of Social Epidemiology, Graduate School of Medicine, Kyoto University, Kyoto, Japan; Hakubi Center, Kyoto University, Kyoto, Japan.
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
J Clin Epidemiol. 2024 Dec;176:111538. doi: 10.1016/j.jclinepi.2024.111538. Epub 2024 Sep 19.
Estimating heterogeneous treatment effects (HTEs) in randomized controlled trials (RCTs) has received substantial attention recently. This has led to the development of several statistical and machine learning (ML) algorithms to assess HTEs through identifying individualized treatment effects. However, a comprehensive review of these algorithms is lacking. We thus aimed to catalog and outline currently available statistical and ML methods for identifying HTEs via effect modeling using clinical RCT data and summarize how they have been applied in practice.
We performed a scoping review using prespecified search terms in MEDLINE and Embase to identify studies, published from 2010 to 2022, that assessed HTEs in RCT data using advanced statistical and ML methods.
Of the 32 studies identified in the review, 17 applied existing algorithms to RCT data, and 15 extended existing algorithms or proposed new ones. Applied algorithms included penalized regression, causal forest, Bayesian causal forest, and other metalearner frameworks. Of these methods, causal forest was the most frequently used (7 studies), followed by Bayesian causal forest (4 studies). Most applications were in cardiology (6 studies), followed by psychiatry (4 studies). We provide example R code, run on simulated data, to illustrate how to implement these algorithms.
This review identified and outlined various algorithms currently used to identify HTEs and individualized treatment effects in RCT data. Given the increasing availability of new algorithms, analysts should carefully select them after examining model performance and considering how the models will be used in practice.
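To make the idea of effect modeling with a metalearner concrete, the following is a minimal sketch in Python of a T-learner on simulated two-arm trial data. It is not the authors' example R code and does not reproduce any method from the reviewed studies. The data-generating model, the single covariate `x`, the linear base learners, and the function names `simulate_rct`, `fit_line`, and `t_learner` are all assumptions chosen for illustration. A T-learner fits one outcome model per arm and takes the difference in predictions as the estimated individualized treatment effect.

```python
import random


def simulate_rct(n, seed=0):
    """Simulate a 1:1 randomized trial (hypothetical data-generating model).

    One covariate x in [0, 1); the assumed true treatment effect is 0.5 + 1.5 * x.
    Returns a list of (x, treatment, outcome) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        t = 1 if rng.random() < 0.5 else 0  # randomization with probability 0.5
        y = 1.0 + 2.0 * x + t * (0.5 + 1.5 * x) + rng.gauss(0.0, 0.5)
        rows.append((x, t, y))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def t_learner(rows):
    """Fit a separate outcome model in each arm; return a function x -> estimated effect."""
    fits = {}
    for arm in (0, 1):
        xs = [x for x, t, _ in rows if t == arm]
        ys = [y for _, t, y in rows if t == arm]
        fits[arm] = fit_line(xs, ys)
    (a0, b0), (a1, b1) = fits[0], fits[1]
    # Estimated effect = predicted outcome under treatment minus under control.
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)


rows = simulate_rct(5000, seed=42)
cate = t_learner(rows)
for x in (0.0, 0.5, 1.0):
    print(f"x={x:.1f}  estimated effect={cate(x):.2f}  assumed true effect={0.5 + 1.5 * x:.2f}")
```

In this simulation the effect is known by construction, so the estimate can be checked against it directly. In real trial data the true individualized effect is not observed, which is one reason the model performance checks mentioned in the conclusion matter.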