A generalized likelihood-based Bayesian approach for scalable joint regression and covariance selection in high dimensions.

Author information

Samanta Srijata, Khare Kshitij, Michailidis George

Affiliations

Department of Statistics, University of Florida.

Publication information

Stat Comput. 2022 Jun;32(3). doi: 10.1007/s11222-022-10102-5. Epub 2022 Jun 3.

DOI:10.1007/s11222-022-10102-5

PMID:36713060

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9881595/

Abstract

The paper addresses joint sparsity selection in the regression coefficient matrix and the error precision (inverse covariance) matrix for high-dimensional multivariate regression models in the Bayesian paradigm. The selected sparsity patterns are crucial for understanding the network of relationships between the predictor and response variables, as well as the conditional relationships among the latter. While Bayesian methods have the advantage of providing natural uncertainty quantification through posterior inclusion probabilities and credible intervals, current Bayesian approaches are restricted to specific sub-classes of sparsity patterns, do not scale to settings with hundreds of responses and predictors, or both. Bayesian approaches that focus only on estimating the posterior mode are scalable, but do not generate samples from the posterior distribution for uncertainty quantification. Using a bi-convex regression-based generalized likelihood and spike-and-slab priors, we develop an algorithm called Joint Regression Network Selector (JRNS) for joint regression and covariance selection which (a) can accommodate general sparsity patterns, (b) provides posterior samples for uncertainty quantification, and (c) is scalable and orders of magnitude faster than the state-of-the-art Bayesian approaches providing uncertainty quantification. We demonstrate the statistical and computational efficacy of the proposed approach on synthetic data and through the analysis of selected cancer data sets. We also establish high-dimensional posterior consistency for one of the developed algorithms.
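To make the setting concrete, the following is a minimal Python sketch of the kind of model the abstract describes and of how posterior inclusion probabilities are typically read off posterior samples. It is not the JRNS algorithm and does not implement the paper's generalized likelihood or sampler. The function names, the 10% sparsity level, the tridiagonal precision matrix, and the 0.5 selection threshold are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_multivariate_regression(n, p, q, rng):
    """Simulate Y = X B + E, where each row of E is N(0, Omega^{-1}).

    Hypothetical helper: B is a sparse p x q coefficient matrix and
    Omega is a sparse q x q error precision (inverse covariance) matrix.
    """
    X = rng.standard_normal((n, p))
    # Assumed sparsity level: roughly 10% of coefficients are nonzero.
    B = rng.standard_normal((p, q)) * (rng.random((p, q)) < 0.1)
    # Assumed structure: tridiagonal precision matrix. It is strictly
    # diagonally dominant (1 > 0.3 + 0.3), hence positive definite.
    Omega = np.eye(q) + 0.3 * (np.eye(q, k=1) + np.eye(q, k=-1))
    # With Omega = L L^T and z ~ N(0, I), the vector L^{-T} z has
    # covariance (L L^T)^{-1} = Omega^{-1}.
    L = np.linalg.cholesky(Omega)
    Z = rng.standard_normal((n, q))
    E = np.linalg.solve(L.T, Z.T).T
    return X, X @ B + E, B, Omega


def inclusion_probabilities(indicator_draws):
    """Average 0/1 inclusion indicators over posterior draws (axis 0)."""
    return np.asarray(indicator_draws, dtype=float).mean(axis=0)


def select_support(probs, threshold=0.5):
    """Keep entries whose inclusion probability exceeds the threshold.

    The 0.5 cutoff is a common convention, assumed here for illustration.
    """
    return np.asarray(probs) > threshold
```

Given an array of sampled 0/1 indicators with shape (draws, p, q) from any spike-and-slab sampler, `inclusion_probabilities` returns a p x q matrix of posterior inclusion probabilities, and `select_support` turns it into an estimated sparsity pattern. The same two helpers apply unchanged to indicators for the off-diagonal entries of the precision matrix.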
