Sparse Regression Incorporating Graphical Structure among Predictors.

Author information

Guan Yu, Yufeng Liu

Affiliations

Guan Yu is a Ph.D. Candidate, Department of Statistics and Operations Research. Yufeng Liu is a Professor, Department of Statistics and Operations Research, Carolina Center for Genome Science, and Department of Biostatistics, University of North Carolina at Chapel Hill, NC 27599.

Publication information

J Am Stat Assoc. 2016;111(514):707-720. doi: 10.1080/01621459.2015.1034319. Epub 2016 Aug 18.

DOI:10.1080/01621459.2015.1034319

PMID:29503486

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5830184/

Abstract

With the abundance of high-dimensional data in various disciplines, sparse regularized techniques are very popular these days. In this paper, we make use of the structure information among predictors to improve sparse regression models. Typically, such structure information can be modeled by the connectivity of an undirected graph using all predictors as nodes of the graph. Most existing methods use this undirected graph edge-by-edge to encourage the regression coefficients of corresponding connected predictors to be similar. However, such methods do not directly utilize the neighborhood information of the graph. Furthermore, if there are more edges in the predictor graph, the corresponding regularization term will be more complicated. In this paper, we incorporate the graph information node-by-node, instead of edge-by-edge as used in most existing methods. Our proposed method is very general and it includes adaptive Lasso, group Lasso, and ridge regression as special cases. Both theoretical and numerical studies demonstrate the effectiveness of the proposed method for simultaneous estimation, prediction and model selection.
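The contrast between edge-by-edge and node-by-node use of a predictor graph can be made concrete with a small sketch. The code below is an illustration only, not the authors' implementation: the abstract does not state the penalty, so the fused-type edge penalty (sum of |beta_i - beta_j| over edges) and the grouping of each node with its neighbors are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def edge_list(adj):
    """List each undirected edge (i, j) once, with i < j."""
    adj = np.asarray(adj, dtype=bool)
    rows, cols = np.nonzero(np.triu(adj, k=1))
    return list(zip(rows.tolist(), cols.tolist()))

def edgewise_penalty(beta, adj):
    """Assumed edge-by-edge penalty: one term |beta_i - beta_j| per edge."""
    return float(sum(abs(beta[i] - beta[j]) for i, j in edge_list(adj)))

def closed_neighborhoods(adj):
    """Assumed node-by-node grouping: node j together with its neighbors."""
    adj = np.asarray(adj, dtype=bool)
    p = adj.shape[0]
    return [sorted({j, *np.flatnonzero(adj[j]).tolist()}) for j in range(p)]

# Toy predictor graph: path 0 - 1 - 2, with predictor 3 isolated.
adj = np.zeros((4, 4), dtype=int)
adj[0, 1] = adj[1, 0] = 1
adj[1, 2] = adj[2, 1] = 1
beta = np.array([1.0, 1.0, 0.5, 0.0])

edges = edge_list(adj)              # number of terms grows with the edge count
groups = closed_neighborhoods(adj)  # always exactly one group per predictor
print(len(edges), "edge terms;", len(groups), "node groups")
print("edge-by-edge penalty:", edgewise_penalty(beta, adj))
print("node groups:", groups)
```

Under these assumptions, a graph with no edges yields singleton groups, which is the setting where a group-type penalty would act coordinate by coordinate, in line with the abstract's remark that Lasso-type penalties arise as special cases.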
