CANONICAL THRESHOLDING FOR NON-SPARSE HIGH-DIMENSIONAL LINEAR REGRESSION.

Authors

Silin Igor, Fan Jianqing

Affiliation

Princeton University.

Publication information

Ann Stat. 2022 Feb;50(1):460-486. doi: 10.1214/21-aos2116. Epub 2022 Feb 16.

Abstract

We consider a high-dimensional linear regression problem. Unlike many papers on the topic, we do not require sparsity of the regression coefficients; instead, our main structural assumption is a decay of the eigenvalues of the covariance matrix of the data. We propose a new family of estimators, called the canonical thresholding estimators, which pick the largest regression coefficients in the canonical form. The estimators admit an explicit form and can be linked to LASSO and Principal Component Regression (PCR). A theoretical analysis is provided for both the fixed design and random design settings. The bounds obtained on the mean squared error and the prediction error of a specific estimator from the family allow us to state clearly sufficient conditions on the decay of eigenvalues that ensure convergence. In addition, we promote the use of relative errors, which are strongly linked with the out-of-sample R². The study of these relative errors leads to a new concept of joint effective dimension, which incorporates the covariance of the data and the regression coefficients simultaneously and describes the complexity of a linear regression problem. Some minimax lower bounds are established to showcase the optimality of our procedure. Numerical simulations confirm the good performance of the proposed estimators compared with previously developed methods.
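The idea of picking the largest regression coefficients in the canonical form can be illustrated with a minimal sketch. The abstract does not spell out the construction, so everything below is an assumption made for illustration: the canonical form is taken to be the rotation and rescaling of the design given by its singular value decomposition (so that the transformed design has identity sample covariance), and the selection rule is plain hard thresholding at a user-chosen level. The function name `canonical_thresholding` and the parameters `tau` and `rank_tol` are hypothetical and are not taken from the paper.

```python
import numpy as np


def canonical_thresholding(X, y, tau, rank_tol=1e-10):
    """Hard-threshold least-squares coefficients in an assumed canonical basis.

    Illustrative sketch only; not the authors' reference implementation.
    X: (n, p) design matrix, y: (n,) response, tau: threshold level (assumed).
    """
    n = X.shape[0]

    # Thin SVD: X = U @ diag(s) @ Vt. The rows of Vt are eigenvectors of the
    # sample covariance X.T @ X / n, with eigenvalues s**2 / n.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Drop numerically zero directions so the rescaling below is well defined.
    keep = s > rank_tol * s.max()
    U, s, Vt = U[:, keep], s[keep], Vt[keep]

    # Canonical design Z = sqrt(n) * U satisfies Z.T @ Z / n = I, so the
    # least-squares coefficients in that basis are Z.T @ y / n.
    theta_ls = (U.T @ y) / np.sqrt(n)

    # Keep only the coefficients whose magnitude reaches the threshold.
    theta_hat = np.where(np.abs(theta_ls) >= tau, theta_ls, 0.0)

    # Map back to the original coordinates: beta = V @ diag(sqrt(n) / s) @ theta.
    beta_hat = Vt.T @ (np.sqrt(n) * theta_hat / s)
    return beta_hat, theta_hat
```

Under these assumptions, `tau = 0` keeps every component and returns the minimum-norm least-squares solution, while a larger `tau` keeps fewer components. The contrast with PCR is that components are chosen by the size of their canonical coefficient rather than by the size of their eigenvalue; replacing the hard threshold with soft thresholding would correspond to a LASSO-type penalty on the canonical coefficients.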




