Peterson Christine B, Stingo Francesco C, Vannucci Marina
Department of Health Research and Policy, Stanford University, Stanford, CA, 94305, U.S.A.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, U.S.A.
Stat Med. 2016 Mar 30;35(7):1017-31. doi: 10.1002/sim.6792. Epub 2015 Oct 29.
In this work, we develop a Bayesian approach to perform selection of predictors that are linked within a network. We achieve this by combining a sparse regression model relating the predictors to a response variable with a graphical model describing conditional dependencies among the predictors. The proposed method is well-suited for genomic applications because it allows the identification of pathways of functionally related genes or proteins that impact an outcome of interest. In contrast to previous approaches for network-guided variable selection, we infer the network among predictors using a Gaussian graphical model and do not assume that network information is available a priori. We demonstrate that our method outperforms existing methods in identifying network-structured predictors in simulation settings and illustrate our proposed model with an application to inference of proteins relevant to glioblastoma survival.
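As a rough illustration of how a network among predictors can inform variable selection, the following minimal sketch estimates a graph from partial correlations and then scores sets of selected predictors with a Markov random field style prior that favors predictors connected in that graph. It is a simplified, non-Bayesian stand-in and not the authors' implementation: the abstract does not state the prior or the sampling scheme, so the function names, the thresholding rule, the form of the prior, and the hyperparameters `threshold`, `a`, and `b` are all assumptions made for illustration.

```python
import numpy as np


def partial_correlations(X):
    """Partial correlations from the inverse of the sample covariance matrix."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def estimate_graph(X, threshold=0.2):
    """Adjacency matrix with an edge wherever |partial correlation| > threshold.

    Thresholding is an illustrative shortcut (assumption); a full Bayesian
    Gaussian graphical model would instead place a prior on the graph.
    """
    adj = (np.abs(partial_correlations(X)) > threshold).astype(int)
    np.fill_diagonal(adj, 0)
    return adj


def log_mrf_prior(gamma, adj, a=-2.0, b=0.5):
    """Unnormalized log prior on inclusion indicators gamma (assumed form).

    a * sum(gamma) penalizes model size; b * gamma' G gamma rewards
    selecting predictors that are linked in the graph G.
    """
    gamma = np.asarray(gamma)
    return a * gamma.sum() + b * (gamma @ adj @ gamma)


# Simulated predictors: x2 depends on x1, while x3 is independent of both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = x1 + rng.normal(size=2000)
x3 = rng.normal(size=2000)
X = np.column_stack([x1, x2, x3])

adj = estimate_graph(X)
print(adj)
print(log_mrf_prior([1, 1, 0], adj))  # linked pair of predictors
print(log_mrf_prior([1, 0, 1], adj))  # unlinked pair of predictors
```

Under this assumed prior, a pair of predictors joined by an edge receives a higher prior score than an unlinked pair of the same size, which is the mechanism by which graph structure can encourage selection of connected groups such as pathways.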