Sequential Co-Sparse Factor Regression.

Authors

Mishra Aditya, Dey Dipak K, Chen Kun

Affiliation

Department of Statistics, University of Connecticut.

Publication details

J Comput Graph Stat. 2017;26(4):814-825. doi: 10.1080/10618600.2017.1340891. Epub 2017 Oct 16.

DOI:10.1080/10618600.2017.1340891

PMID:30337797

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6190918/

Abstract

In multivariate regression models, a sparse singular value decomposition of the regression component matrix is appealing for reducing dimensionality and facilitating interpretation. However, the recovery of such a decomposition remains very challenging, largely due to the simultaneous presence of orthogonality constraints and co-sparsity regularization. By delving into the underlying statistical data generation mechanism, we reformulate the problem as a supervised co-sparse factor analysis, and develop an efficient computational procedure, named sequential factor extraction via co-sparse unit-rank estimation (SeCURE), that completely bypasses the orthogonality requirements. At each step, the problem reduces to a sparse multivariate regression with a unit-rank constraint. Nicely, each sequentially extracted sparse and unit-rank coefficient matrix automatically leads to co-sparsity in its pair of singular vectors. Each latent factor is thus a sparse linear combination of the predictors and may influence only a subset of responses. The proposed algorithm is guaranteed to converge, and it ensures efficient computation even with incomplete data and/or when enforcing exact orthogonality is desired. Our estimators enjoy the oracle properties asymptotically; a non-asymptotic error bound further reveals some interesting finite-sample behaviors of the estimators. The efficacy of SeCURE is demonstrated by simulation studies and two applications in genetics.
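The sequential unit-rank idea in the abstract can be written out as a short sketch. The notation below ($Y$, $X$, $C$, $d_k$, $u_k$, $v_k$, $\lambda_k$, and the plain $\ell_1$ penalty) is assumed for illustration and is not quoted from the abstract; the exact penalty and normalization used by the authors may differ.

```latex
% Assumed setup: n samples, p predictors, q responses.
% Multivariate regression with a rank-r coefficient matrix:
\[
  Y = XC + E, \qquad
  C = \sum_{k=1}^{r} d_k\, u_k v_k^{\top}, \qquad
  Y \in \mathbb{R}^{n \times q},\; X \in \mathbb{R}^{n \times p},\; C \in \mathbb{R}^{p \times q}.
\]

% Step k: a sparse regression with a unit-rank constraint
% (illustrative l1 penalty on the entries of the unit-rank matrix):
\[
  (\hat d_k, \hat u_k, \hat v_k)
  = \arg\min_{d \ge 0,\, u,\, v}\;
    \tfrac{1}{2}\,\bigl\| Y_k - d\, X u v^{\top} \bigr\|_F^2
    + \lambda_k \bigl\| d\, u v^{\top} \bigr\|_1 .
\]

% Why entrywise sparsity of a unit-rank matrix gives co-sparsity:
% the (i, j) entry is d u_i v_j, so
\[
  \bigl\| d\, u v^{\top} \bigr\|_1
  = d \sum_{i=1}^{p} \sum_{j=1}^{q} |u_i|\,|v_j|
  = d\, \|u\|_1 \|v\|_1 ,
\]
% and row i vanishes exactly when u_i = 0, column j exactly when v_j = 0.

% Sequential extraction by deflation, starting from Y_1 = Y:
\[
  Y_{k+1} = Y_k - \hat d_k\, X \hat u_k \hat v_k^{\top},
  \qquad k = 1, \dots, r-1 .
\]
```

Under these assumptions, a zero in $\hat u_k$ drops a predictor from the $k$-th latent factor $X \hat u_k$, and a zero in $\hat v_k$ means that factor does not affect the corresponding response, which matches the abstract's description of each factor as a sparse combination of predictors influencing only a subset of responses.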

Similar articles

Robust reduced-rank regression.

Biometrika. 2017 Sep;104(3):633-647. doi: 10.1093/biomet/asx032. Epub 2017 Jul 12.

SOFAR: Large-Scale Association Network Learning.

IEEE Trans Inf Theory. 2019 Aug;65(8):4924-4939. doi: 10.1109/tit.2019.2909889. Epub 2019 Apr 11.

Biclustering via sparse singular value decomposition.

Biometrics. 2010 Dec;66(4):1087-95. doi: 10.1111/j.1541-0420.2010.01392.x.

Bayesian sparse reduced rank multivariate regression.

J Multivar Anal. 2017 May;157:14-28. doi: 10.1016/j.jmva.2017.02.007. Epub 2017 Mar 4.

A note on rank reduction in sparse multivariate regression.

J Stat Theory Pract. 2016;10(1):100-120. doi: 10.1080/15598608.2015.1081573. Epub 2015 Aug 18.

A constrained singular value decomposition method that integrates sparsity and orthogonality.

PLoS One. 2019 Mar 13;14(3):e0211463. doi: 10.1371/journal.pone.0211463. eCollection 2019.

Integrative multi-view regression: Bridging group-sparse and low-rank models.

Biometrics. 2019 Jun;75(2):593-602. doi: 10.1111/biom.13006. Epub 2019 Mar 29.

Negative binomial factor regression with application to microbiome data analysis.

Stat Med. 2022 Jul 10;41(15):2786-2803. doi: 10.1002/sim.9384. Epub 2022 Apr 24.

svt: Singular Value Thresholding in MATLAB.

J Stat Softw. 2017;81(2). doi: 10.18637/jss.v081.c02. Epub 2017 Nov 8.

Cited by

TARO: tree-aggregated factor regression for microbiome data integration.

Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae321.

Negative binomial factor regression with application to microbiome data analysis.

Stat Med. 2022 Jul 10;41(15):2786-2803. doi: 10.1002/sim.9384. Epub 2022 Apr 24.

A regression framework to uncover pleiotropy in large-scale electronic health record data.

J Am Med Inform Assoc. 2019 Oct 1;26(10):1083-1090. doi: 10.1093/jamia/ocz084.
