Hu Liangyuan, Gu Chenyang, Lopez Michael, Ji Jiayi, Wisnivesky Juan
Department of Population Health Science and Policy, Icahn School of Medicine, New York, USA.
Institute for Health Care Delivery Science, Icahn School of Medicine, New York, USA.
Stat Methods Med Res. 2020 Nov;29(11):3218-3234. doi: 10.1177/0962280220921909. Epub 2020 May 25.
There is a dearth of robust methods to estimate the causal effects of multiple treatments when the outcome is binary. This paper uses two unique sets of simulations to propose and evaluate the use of Bayesian additive regression trees in such settings. First, we compare Bayesian additive regression trees to several approaches that have been proposed for continuous outcomes, including inverse probability of treatment weighting, targeted maximum likelihood estimator, vector matching, and regression adjustment. Results suggest that under conditions of non-linearity and non-additivity of both the treatment assignment and outcome generating mechanisms, Bayesian additive regression trees, targeted maximum likelihood estimator, and inverse probability of treatment weighting using generalized boosted models provide better bias reduction and smaller root mean squared error. Bayesian additive regression trees and targeted maximum likelihood estimator provide more consistent 95% confidence interval coverage and better large-sample convergence properties. Second, we supply Bayesian additive regression trees with a strategy to identify a common support region for retaining inferential units and for avoiding extrapolation over areas of the covariate space where common support does not exist. Bayesian additive regression trees retain more inferential units than the generalized propensity score-based strategy, and show lower bias than targeted maximum likelihood estimator or generalized boosted models, in a variety of scenarios differing by the degree of covariate overlap. A case study examining the effects of three surgical approaches for non-small cell lung cancer demonstrates the methods.
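As a rough illustration of one comparator named in the abstract, inverse probability of treatment weighting with three treatments and a binary outcome can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `iptw_risks` and `risk_difference`, the toy data, and the generalized propensity score (GPS) values are all hypothetical, and the GPS is treated as already known, whereas in practice it would be estimated (for example with generalized boosted models, as the abstract mentions).

```python
from collections import defaultdict


def iptw_risks(treatments, outcomes, gps):
    """Normalized (Hajek) IPTW estimate of P(Y(t) = 1) for each treatment t.

    treatments: list of treatment labels, one per unit
    outcomes:   list of 0/1 outcomes, one per unit
    gps:        list of dicts; gps[i][t] is the assumed probability that
                unit i receives treatment t (hypothetical values here)
    """
    weighted_events = defaultdict(float)
    weight_totals = defaultdict(float)
    for t, y, p in zip(treatments, outcomes, gps):
        w = 1.0 / p[t]  # weight = inverse probability of the treatment received
        weighted_events[t] += w * y
        weight_totals[t] += w
    return {t: weighted_events[t] / weight_totals[t] for t in weight_totals}


def risk_difference(risks, a, b):
    """Pairwise causal risk difference between treatments a and b."""
    return risks[a] - risks[b]


# Hypothetical toy data: six units, three treatments, binary outcome.
treatments = ["A", "A", "B", "B", "C", "C"]
outcomes = [1, 0, 1, 1, 0, 0]
gps = [
    {"A": 0.50, "B": 0.25, "C": 0.25},
    {"A": 0.25, "B": 0.50, "C": 0.25},
    {"A": 0.25, "B": 0.50, "C": 0.25},
    {"A": 0.25, "B": 0.25, "C": 0.50},
    {"A": 0.25, "B": 0.25, "C": 0.50},
    {"A": 0.50, "B": 0.25, "C": 0.25},
]

risks = iptw_risks(treatments, outcomes, gps)
rd_a_vs_b = risk_difference(risks, "A", "B")
```

Normalizing the weights within each arm keeps every estimated risk inside [0, 1], which is convenient for a binary outcome. The sketch does not address common support: units with very small GPS values receive very large weights, which is the kind of limited-overlap situation the abstract's second set of simulations examines.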