Antik Chakraborty, Anirban Bhattacharya, Bani K. Mallick
Department of Statistics, Texas A&M University, College Station, Texas, 77843, USA.
Biometrika. 2020 Mar;107(1):205-221. doi: 10.1093/biomet/asz056. Epub 2019 Nov 23.
We develop a Bayesian methodology aimed at simultaneously estimating low-rank and row-sparse matrices in a high-dimensional multiple-response linear regression model. We consider a carefully devised shrinkage prior on the matrix of regression coefficients which obviates the need to specify a prior on the rank, and shrinks the regression matrix towards low-rank and row-sparse structures. We provide theoretical support for the proposed methodology by proving minimax optimality of the posterior mean under the prediction risk in ultra-high-dimensional settings where the number of predictors can grow sub-exponentially relative to the sample size. A one-step post-processing scheme induced by group lasso penalties on the rows of the estimated coefficient matrix is proposed for variable selection, with default choices of tuning parameters. We additionally provide an estimate of the rank using a novel optimization function, achieving dimension reduction in the covariate space. We demonstrate the performance of the proposed methodology in an extensive simulation study and a real-data example.
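To make the post-processing step concrete, the following is a minimal illustrative sketch of a row-wise group-lasso thresholding operator of the kind the abstract alludes to: each row of an estimated coefficient matrix is soft-thresholded by its Euclidean norm, so rows with small norm are set exactly to zero (variable selection) while large rows are shrunk proportionally. The function name, the fixed penalty level, and the synthetic matrix are all hypothetical choices for demonstration, not the authors' actual procedure or their default tuning-parameter rule.

```python
import numpy as np

def group_lasso_row_threshold(C, lam):
    """Row-wise group-lasso (block soft-thresholding) operator.

    For each row c_i of C, returns
        c_i * max(0, 1 - lam / ||c_i||_2),
    zeroing out rows whose Euclidean norm is at most lam.
    This is the proximal operator of lam * sum_i ||c_i||_2.
    """
    out = np.zeros_like(C, dtype=float)
    row_norms = np.linalg.norm(C, axis=1)
    for i, n in enumerate(row_norms):
        if n > lam:
            out[i] = C[i] * (1.0 - lam / n)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical "posterior mean" estimate: 5 predictors, 3 responses,
    # with only the first two rows carrying real signal.
    C_hat = np.vstack([
        rng.normal(scale=2.0, size=(2, 3)),   # signal rows
        rng.normal(scale=0.05, size=(3, 3)),  # near-zero noise rows
    ])
    C_sparse = group_lasso_row_threshold(C_hat, lam=0.5)
    selected = np.flatnonzero(np.linalg.norm(C_sparse, axis=1) > 0)
    print("selected predictor rows:", selected)
```

In the paper's setting the thresholding is applied to the posterior mean of the coefficient matrix, with the penalty level chosen by a default rule rather than the fixed `lam=0.5` used here.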