Zhou Fei, Ren Jie, Ma Shuangge, Wu Cen
Department of Statistics, Kansas State University, Manhattan, KS.
Department of Biostatistics, Indiana University School of Medicine, Indianapolis, IN.
Comput Stat Data Anal. 2023 Nov;187. doi: 10.1016/j.csda.2023.107808. Epub 2023 Jun 23.
The quantile varying coefficient (VC) model can flexibly capture dynamic patterns in regression coefficients. In addition, because it is built on the quantile check loss function, it is robust against outliers and heavy-tailed distributions of the response variable, and it gives a more comprehensive picture by modeling the conditional quantiles of the response rather than only its mean. Although extensive studies have examined variable selection for high-dimensional quantile varying coefficient models, Bayesian analysis has rarely been developed. A Bayesian regularized quantile varying coefficient model is proposed to provide robustness against data heterogeneity while accommodating non-linear interactions between the effect modifier and the predictors. Important varying coefficients are selected through Bayesian variable selection, and incorporating multivariate spike-and-slab priors further improves performance by inducing exact sparsity. A Gibbs sampler is derived for efficient posterior inference of the sparse Bayesian quantile VC model through Markov chain Monte Carlo (MCMC). The advantage of the proposed model over the alternatives in selection and estimation accuracy is systematically investigated in simulations at specific quantile levels and under multiple heavy-tailed error distributions. In the case study, the proposed model identifies biologically sensible markers in a non-linear gene-environment interaction study using the NHS data.
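As a rough illustration of why the check loss yields robustness, the sketch below fits a single constant to a small sample by minimizing the total check loss, once on clean data and once with a gross outlier. It is not the proposed Bayesian model: the function names `check_loss` and `best_constant` and the toy data are assumptions made only for this example, and the fit searches over the observed values rather than running any sampler.

```python
import numpy as np

def check_loss(residual, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    residual = np.asarray(residual, dtype=float)
    # Negative residuals are weighted by (tau - 1), non-negative ones by tau.
    return residual * np.where(residual < 0, tau - 1.0, tau)

def best_constant(y, tau):
    """Return the observed value that minimizes the total check loss."""
    y = np.asarray(y, dtype=float)
    totals = [check_loss(y - c, tau).sum() for c in y]
    return y[int(np.argmin(totals))]

# Toy data (hypothetical): the second sample replaces 5.0 with a gross outlier.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_outlier = np.array([1.0, 2.0, 3.0, 4.0, 500.0])

fit_clean = best_constant(y, 0.5)            # median fit on clean data
fit_outlier = best_constant(y_outlier, 0.5)  # median fit with the outlier
mean_outlier = y_outlier.mean()              # least-squares fit with the outlier

print(fit_clean, fit_outlier, mean_outlier)
```

At tau = 0.5 the check-loss fit stays at the sample median in both cases, while the mean is pulled far toward the outlier. Changing `tau` targets other conditional quantiles with the same loss.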