Department of Statistics, Rice University, Houston, TX 77005, United States.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States.
Biometrics. 2024 Oct 3;80(4). doi: 10.1093/biomtc/ujae111.
In this paper, we propose Varying Effects Regression with Graph Estimation (VERGE), a novel Bayesian method for feature selection in regression. The model is designed to exploit the complex structure of data sets arising from genomics or imaging studies. We distinguish between the predictors, which are the features used in the outcome prediction model, and the subject-level covariates, which modulate the effects of the predictors on the outcome. We construct a varying coefficients modeling framework in which we infer a network among the predictor variables and use this network information to encourage the selection of related predictors. We employ spike-and-slab variable selection priors that enable the selection of both network-linked predictor variables and covariates that modify the predictor effects. We demonstrate through simulation studies that our method outperforms existing alternatives in terms of both feature selection and predictive accuracy. We illustrate VERGE with an application to characterizing the influence of gut microbiome features on obesity, where we identify a set of microbial taxa and their ecological dependence relations. We allow subject-level covariates, including sex and dietary intake variables, to modify the coefficients of the microbiome predictors, providing additional insight into the interplay between these factors.
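The distinction between predictors and subject-level covariates can be made concrete with a small sketch. The abstract does not state the functional form of the varying coefficients, so the code below assumes, purely for illustration, that each predictor's coefficient is linear in the covariates: beta_j(z_i) = beta0[j] + z_i · theta[j]. The names `varying_coefficients`, `predict`, `beta0`, and `theta` are hypothetical, and the sketch covers only the mean structure, not the graph estimation or the spike-and-slab priors.

```python
import numpy as np

def varying_coefficients(z, beta0, theta):
    """Subject-specific coefficients, one row per subject.

    Assumed (illustrative) linear form, not taken from the paper:
        beta_j(z_i) = beta0[j] + z_i @ theta[j]
    z: (n, q) covariates, beta0: (p,) main effects, theta: (p, q) modifiers.
    """
    return beta0[None, :] + z @ theta.T  # shape (n, p)

def predict(x, z, beta0, theta):
    """Mean outcome: sum over predictors of beta_j(z_i) * x_ij."""
    return np.sum(x * varying_coefficients(z, beta0, theta), axis=1)

rng = np.random.default_rng(0)
n, p, q = 200, 5, 2                 # subjects, predictors, covariates
x = rng.normal(size=(n, p))         # predictors (e.g. microbiome features)
z = rng.normal(size=(n, q))         # subject-level covariates (e.g. diet)

# Sparse truth: predictors 0 and 2 have main effects; only predictor 0's
# effect is modified, and only by covariate 1.
beta0 = np.array([1.0, 0.0, -0.5, 0.0, 0.0])
theta = np.zeros((p, q))
theta[0, 1] = 0.8

y = predict(x, z, beta0, theta) + rng.normal(scale=0.1, size=n)
```

In this toy setup, selecting a predictor corresponds to a nonzero entry of `beta0` or a nonzero row of `theta`, and selecting an effect-modifying covariate corresponds to a nonzero entry within that row; these are the quantities on which sparsity-inducing priors would act.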