Generalized Matrix Decomposition Regression: Estimation and Inference for Two-Way Structured Data

Authors

Wang Yue, Shojaie Ali, Randolph Timothy, Knight Parker, Ma Jing

Affiliations

Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus.

Department of Biostatistics, University of Washington.

Publication information

Ann Appl Stat. 2023 Dec;17(4):2944-2969. doi: 10.1214/23-aoas1746. Epub 2023 Oct 30.

Abstract

Motivated by emerging applications in ecology, microbiology, and neuroscience, this paper studies high-dimensional regression with two-way structured data. To estimate the high-dimensional coefficient vector, we propose the generalized matrix decomposition regression (GMDR) to efficiently leverage auxiliary information on row and column structures. GMDR extends the principal component regression (PCR) to two-way structured data, but unlike PCR, GMDR selects the components that are most predictive of the outcome, leading to more accurate prediction. For inference on regression coefficients of individual variables, we propose the generalized matrix decomposition inference (GMDI), a general high-dimensional inferential framework for a large family of estimators that include the proposed GMDR estimator. GMDI provides more flexibility for incorporating relevant auxiliary row and column structures. As a result, GMDI does not require the true regression coefficients to be sparse, but constrains the coordinate system representing the regression coefficients according to the column structure. GMDI also allows dependent and heteroscedastic observations. We study the theoretical properties of GMDI in terms of both the type-I error rate and power and demonstrate the effectiveness of GMDR and GMDI in simulation studies and an application to human microbiome data.
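
The abstract does not spell out the estimator, so the following is only a minimal sketch of the idea it describes: decompose the data matrix with respect to row and column structures, then keep the components most predictive of the outcome instead of the leading ones, as PCR would. Several things here are assumptions rather than details from the abstract: the row structure `H` and column structure `Q` are taken to be symmetric positive definite matrices, the decomposition is computed through an ordinary SVD of `H^{1/2} X Q^{1/2}`, and "most predictive" is measured by the squared `H`-weighted projection of `y` onto each component. The function names `sym_power`, `gmd`, and `gmdr_sketch` are hypothetical.

```python
import numpy as np

def sym_power(M, power):
    """Power of a symmetric positive definite matrix via its eigendecomposition."""
    w, E = np.linalg.eigh(M)
    return (E * w**power) @ E.T

def gmd(X, H, Q):
    """Decompose X = U diag(s) V^T with U^T H U = I and V^T Q V = I.

    Assumption: H (n x n) and Q (p x p) are symmetric positive definite,
    so the problem reduces to an SVD of H^{1/2} X Q^{1/2}.
    """
    Hh, Qh = sym_power(H, 0.5), sym_power(Q, 0.5)
    U_t, s, Vt_t = np.linalg.svd(Hh @ X @ Qh, full_matrices=False)
    U = np.linalg.solve(Hh, U_t)     # H^{-1/2} times left singular vectors
    V = np.linalg.solve(Qh, Vt_t.T)  # Q^{-1/2} times right singular vectors
    return U, s, V

def gmdr_sketch(X, y, H, Q, n_keep):
    """Regress y on the n_keep components most associated with y (illustrative)."""
    U, s, V = gmd(X, H, Q)
    nonzero = s > 1e-10 * s.max()    # drop numerically null components
    U, s, V = U[:, nonzero], s[nonzero], V[:, nonzero]
    gamma = U.T @ H @ y              # H-weighted projection of y on each component
    keep = np.argsort(-gamma**2)[:n_keep]  # rank by association with y, not by s
    # X Q V = U diag(s), so the fit U[:, keep] @ gamma[keep] equals X @ beta.
    beta = Q @ V[:, keep] @ (gamma[keep] / s[keep])
    return beta, keep
```

With `H` and `Q` set to identity matrices and all components kept, this sketch reduces to ordinary least squares when `X` has full column rank, which is a convenient sanity check. Selecting components by `gamma**2` rather than by singular value is what distinguishes the sketch from plain PCR.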



