Department of Veterans Affairs Eastern Colorado Healthcare System, Aurora, CO; Division of Hospital Medicine, University of Colorado School of Medicine, Aurora, CO; Colorado Cardiovascular Outcomes Research Consortium, Aurora, CO.
Department of Veterans Affairs Eastern Colorado Healthcare System, Aurora, CO; Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO.
Ann Epidemiol. 2022 Jan;65:101-108. doi: 10.1016/j.annepidem.2021.07.003. Epub 2021 Jul 17.
Purpose
Machine learning is an attractive tool for identifying heterogeneous treatment effects (HTE) of interventions, but the generalizability of machine learning-derived HTE remains unclear. We examined the generalizability of HTE detected using causal forests in two similarly designed randomized trials in patients with type 2 diabetes.
Methods
We evaluated whether published HTE of intensive versus standard glycemic control on all-cause mortality from the Action to Control Cardiovascular Risk in Diabetes study (ACCORD) replicated in a second trial, the Veterans Affairs Diabetes Trial (VADT). We then applied causal forests to VADT, ACCORD, and pooled data from both studies and compared variable importance and subgroup effects across samples.
Results
HTE in ACCORD did not replicate in similar subgroups in VADT, but variable importance was correlated between VADT and ACCORD (Kendall's tau-b 0.75). Applying causal forests to pooled individual-level data yielded seven subgroups with similar HTE across both studies, with risk differences in all-cause mortality ranging from -3.9% (95% CI -7.0, -0.8) to 4.7% (95% CI 1.8, 7.5).
Conclusions
Machine learning detection of HTE subgroups from randomized trials may not generalize across study samples even when variable importance is correlated. Pooling individual-level data may overcome differences in study populations and/or differences in interventions that limit HTE generalizability.
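The two summary statistics reported in the Results, a risk difference with a 95% confidence interval and Kendall's tau-b between variable-importance scores, can be illustrated with a minimal Python sketch. This is not the authors' code: the abstract does not say how the intervals were computed, so the Wald interval here is an assumption, and the function names, event counts, and importance scores are hypothetical values chosen only for illustration.

```python
import math
from itertools import combinations


def risk_difference_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk difference (treated minus control) with a Wald interval.

    Assumption: a simple Wald interval; the source does not state
    which interval method was used.
    """
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    rd = p_trt - p_ctl
    se = math.sqrt(p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl)
    return rd, rd - z * se, rd + z * se


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length score lists (handles ties)."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both terms
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs_untied_x = concordant + discordant + only_y_tied
    pairs_untied_y = concordant + discordant + only_x_tied
    return (concordant - discordant) / math.sqrt(pairs_untied_x * pairs_untied_y)


# Hypothetical subgroup: deaths and sample sizes per arm (not trial data).
rd, lower, upper = risk_difference_ci(events_trt=30, n_trt=1000,
                                      events_ctl=50, n_ctl=1000)
print(f"risk difference {rd:.1%} (95% CI {lower:.1%}, {upper:.1%})")

# Hypothetical variable-importance scores for the same covariates in two trials.
importance_trial_a = [0.30, 0.25, 0.20, 0.15, 0.10]
importance_trial_b = [0.28, 0.30, 0.18, 0.14, 0.10]
print(f"Kendall's tau-b: {kendall_tau_b(importance_trial_a, importance_trial_b):.2f}")
```

A negative risk difference whose interval excludes zero would indicate lower mortality in the treated arm of that subgroup, and a tau-b near 1 would indicate that the two trials rank covariates similarly. As the abstract notes, similar rankings alone do not imply that the subgroup effects replicate.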