Cabras Stefano
Department of Statistics, Universidad Carlos III de Madrid, 28903 Madrid, Spain.
Entropy (Basel). 2020 Aug 28;22(9):948. doi: 10.3390/e22090948.
The variable selection problem, in general and specifically for the ordinary linear regression model, is considered in the setting where the number of covariates is too large to allow exploration of all possible models. In this context, Gibbs sampling is needed to perform stochastic model exploration and to estimate, for instance, the model inclusion probability. We show that, under a Bayesian non-parametric prior model for analyzing the Gibbs sampling output, the usual empirical estimator is just the asymptotic version of the expected posterior inclusion probability given the simulation output from Gibbs sampling. Other posterior conditional estimators of inclusion probabilities can also be considered; they relate to the latent probability distributions on the model space, which can be sampled given the observed Gibbs sampling output. This paper also compares, in this large model space setting, the conventional prior approach against the non-local prior approach used to define the Bayes factors for model selection. The approach is illustrated with simulated samples and with an application to modeling Travel and Tourism factors worldwide.
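The relation between the empirical estimator and a posterior expectation can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the abstract: the non-parametric prior is taken to be a Dirichlet process with a hypothetical concentration parameter `alpha` and a uniform base measure over models, so that each covariate has prior inclusion probability 0.5. The function names and the toy `draws` matrix are likewise hypothetical. Each row of `draws` stands for one model visited by the Gibbs sampler, encoded as 0/1 inclusion indicators.

```python
from typing import List, Sequence


def empirical_inclusion(draws: Sequence[Sequence[int]]) -> List[float]:
    """Fraction of Gibbs draws in which each covariate is included."""
    n_draws = len(draws)
    n_covariates = len(draws[0])
    return [sum(row[j] for row in draws) / n_draws for j in range(n_covariates)]


def dp_expected_inclusion(
    draws: Sequence[Sequence[int]],
    alpha: float = 1.0,           # assumed Dirichlet-process concentration
    base_inclusion: float = 0.5,  # assumed uniform base measure over models
) -> List[float]:
    """Posterior expected inclusion probability under the assumed DP prior.

    The posterior mean mixes the base measure (weight alpha) with the
    empirical distribution of the draws (weight equal to their number).
    """
    n_draws = len(draws)
    n_covariates = len(draws[0])
    counts = [sum(row[j] for row in draws) for j in range(n_covariates)]
    return [(alpha * base_inclusion + c) / (alpha + n_draws) for c in counts]


# Toy Gibbs output: 4 visited models over 3 covariates (made-up values).
draws = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]

empirical = empirical_inclusion(draws)
posterior_mean = dp_expected_inclusion(draws, alpha=1.0)

# With many more draws the prior weight alpha becomes negligible and the
# posterior mean approaches the empirical frequency.
long_run = dp_expected_inclusion(draws * 1000, alpha=1.0)
```

With only four draws the assumed prior pulls each estimate toward 0.5; once the same pattern is repeated over thousands of draws, the two estimators are practically indistinguishable, which is the asymptotic equivalence the abstract refers to.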