Reich Brian J, Storlie Curtis B, Bondell Howard D
Department of Statistics, North Carolina State University, 2501 Founders Drive, Box 8203, Raleigh, NC 27695, U.S.A.
Technometrics. 2009 May 1;51(2):110-120. doi: 10.1198/TECH.2009.0013.
With many predictors, choosing an appropriate subset of the covariates is a crucial, and difficult, step in nonparametric regression. We propose a Bayesian nonparametric regression model for curve-fitting and variable selection. We use the smoothing spline ANOVA framework to decompose the regression function into interpretable main effect and interaction functions. Stochastic search variable selection via MCMC sampling is used to search for models that fit the data well. We also show that variable selection is highly sensitive to hyperparameter choice, and we develop a technique for selecting hyperparameters that controls the long-run false positive rate. The method is used to build an emulator for a complex computer model of two-phase fluid flow.
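As a rough illustration of the decomposition mentioned in the abstract, a functional ANOVA expansion truncated at two-way interactions can be written as below. The notation is assumed for illustration and is not taken from the paper: $p$ is the number of predictors, $\mu$ an overall mean, $f_j$ a main effect function, and $f_{jk}$ a two-way interaction function; the paper's exact model and priors may differ.

\[
f(x_1, \ldots, x_p) = \mu + \sum_{j=1}^{p} f_j(x_j) + \sum_{1 \le j < k \le p} f_{jk}(x_j, x_k)
\]

Under this reading, variable selection amounts to deciding which of the components $f_j$ and $f_{jk}$ are identically zero.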