Zeldow Bret, Lo Re Vincent, Roy Jason
Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, Massachusetts 02115, USA.
Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
Ann Appl Stat. 2019 Sep;13(3):1989-2010. doi: 10.1214/19-AOAS1266. Epub 2019 Oct 17.
Bayesian Additive Regression Trees (BART) is a flexible machine learning algorithm capable of capturing nonlinearities between an outcome and covariates, as well as interactions among covariates. We extend BART to a semiparametric regression framework in which the conditional expectation of an outcome is a function of treatment, its effect modifiers, and confounders. The confounders are allowed to have unspecified functional form, while treatment and effect modifiers that are directly related to the research question are given a linear form. The result is a Bayesian semiparametric linear regression model in which the posterior distribution of the parameters of the linear part can be interpreted as in parametric Bayesian regression. This is useful in situations where a subset of the variables is of substantive interest and the others are nuisance variables that we would like to control for. An example of this occurs in causal modeling with the structural mean model (SMM). Under certain causal assumptions, our method can be used as a Bayesian SMM. Our methods are demonstrated with simulation studies and an application to a dataset involving adults with HIV/hepatitis C coinfection who newly initiate antiretroviral therapy. The methods are available in an R package called semibart.
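As a rough illustration of the model structure described above, one way to write it is sketched below. The notation is an assumption made here for illustration and is not taken from the abstract: Y is the outcome, A the treatment, X the effect modifiers, L the confounders, \omega the unspecified confounder function (represented by a BART sum of m trees with tree structures T_j and leaf parameters M_j), and \psi the parameters of the linear part.

```latex
% Illustrative sketch only; all symbols are assumed notation, not the authors' own.
\begin{align}
  % Nonparametric part in the confounders plus a linear part in
  % treatment and its effect modifiers
  \mathbb{E}[Y \mid A, X, L] &= \omega(L) + A\,(\psi_0 + \psi_1^{\top} X), \\
  % The unspecified function of the confounders as a sum of regression trees
  \omega(L) &= \sum_{j=1}^{m} g(L;\, T_j, M_j).
\end{align}
```

In this sketch, \psi_0 would correspond to the treatment effect when X = 0 and \psi_1 to how that effect varies with the effect modifiers. A causal reading of \psi depends on the causal assumptions the abstract refers to.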