Department of Statistics, Florida State University, Tallahassee, Florida.
Division of General Internal Medicine, Brigham and Women's Hospital, Boston, Massachusetts.
Biometrics. 2020 Mar;76(1):131-144. doi: 10.1111/biom.13107. Epub 2019 Nov 1.
This paper demonstrates the advantages of sharing information about unknown features of covariates across multiple model components in various nonparametric regression problems, including those with multivariate, heteroskedastic, and semicontinuous responses. We present a methodology that allows information to be shared nonparametrically across model components using Bayesian sum-of-tree models. Our simulation results show that sharing information across related model components is often very beneficial, particularly in sparse high-dimensional problems in which variable selection must be conducted. We illustrate the methodology by analyzing medical expenditure data from the Medical Expenditure Panel Survey (MEPS). To facilitate the Bayesian nonparametric regression analysis, we develop two novel models for analyzing the MEPS data using Bayesian additive regression trees: a heteroskedastic log-normal hurdle model with a "shrink-toward-homoskedasticity" prior, and a gamma hurdle model.
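To make the term "log-normal hurdle model" concrete, the following minimal sketch evaluates the log-likelihood of a single semicontinuous observation under a generic hurdle model of that kind: the response is zero with probability 1 - p, and otherwise its logarithm is normally distributed. This is an illustration under stated assumptions, not the paper's implementation. The function name `lognormal_hurdle_loglik` and its parameters are hypothetical, and `p_positive`, `mu`, and `sigma` are plain numbers here, whereas in the methodology described above they would be functions of the covariates modeled by sums of trees.

```python
import math


def lognormal_hurdle_loglik(y, p_positive, mu, sigma):
    """Log-likelihood of one observation under a log-normal hurdle model.

    Hypothetical illustration, not the paper's code.
    y          : observed response, must be >= 0
    p_positive : P(Y > 0 | x), strictly between 0 and 1
    mu, sigma  : mean and standard deviation of log(Y) given Y > 0
    """
    if y < 0:
        raise ValueError("y must be non-negative")
    if not 0.0 < p_positive < 1.0:
        raise ValueError("p_positive must lie strictly between 0 and 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")

    if y == 0:
        # Hurdle not crossed: only the point mass at zero contributes.
        return math.log1p(-p_positive)

    # Hurdle crossed: log(p) plus the log-normal log-density of y.
    z = (math.log(y) - mu) / sigma
    log_density = (
        -math.log(y)
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * z * z
    )
    return math.log(p_positive) + log_density
```

Letting `sigma` vary with the covariates is what makes such a model heteroskedastic; holding it constant gives the homoskedastic special case.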