Understanding Implicit Regularization in Over-Parameterized Single Index Model.

Authors

Fan Jianqing, Yang Zhuoran, Yu Mengxin

Affiliations

Frederick L. Moore '18 Professor of Finance, Professor of Statistics, and Professor of Operations Research and Financial Engineering, Princeton University.

Ph.D. students, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA.

Publication Information

J Am Stat Assoc. 2023;118(544):2315-2328. doi: 10.1080/01621459.2022.2044824. Epub 2022 Mar 27.

Abstract

In this paper, we leverage over-parameterization to design regularization-free algorithms for the high-dimensional single index model and provide theoretical guarantees for the induced implicit regularization phenomenon. Specifically, we study both vector and matrix single index models where the link function is nonlinear and unknown, the signal parameter is either a sparse vector or a low-rank symmetric matrix, and the response variable can be heavy-tailed. To gain a better understanding of the role played by implicit regularization without excess technicality, we assume that the distribution of the covariates is known a priori. For both the vector and matrix settings, we construct an over-parameterized least-squares loss function by employing the score function transform and a robust truncation step designed specifically for heavy-tailed data. We propose to estimate the true parameter by applying regularization-free gradient descent to the loss function. When the initialization is close to the origin and the stepsize is sufficiently small, we prove that the obtained solution achieves minimax optimal statistical rates of convergence in both the vector and matrix cases. In addition, our experimental results support our theoretical findings and also demonstrate that our methods empirically outperform classical methods with explicit regularization in terms of both ℓ2-statistical rate and variable selection consistency.
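
The recipe in the abstract can be made concrete with a short simulation sketch of the sparse-vector case. Everything below that is not stated in the abstract is an assumption: the covariates are standard Gaussian, so the score transform reduces to S(x) = x and the Stein-type moment z makes an up-to-scale estimate of the signal direction; the tanh link, truncation threshold, initialization scale, stepsize, and early-stopping time are illustrative choices; and the quadratic loss ||w - z||^2 with the Hadamard over-parameterization w = u∘u - v∘v is a simplified surrogate for the authors' empirical least-squares objective, not their exact algorithm.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic sparse single index model: y = f(<x, theta*>) + heavy-tailed noise.
    # The tanh link stands in for the unknown nonlinear f.
    n, d, s = 2000, 500, 5
    theta = np.zeros(d)
    theta[:s] = 1.0 / np.sqrt(s)                  # s-sparse, unit-norm signal
    X = rng.standard_normal((n, d))               # covariate law known a priori
    y = np.tanh(X @ theta) + 0.5 * rng.standard_t(df=3, size=n)

    # Robust truncation step for heavy-tailed responses (the threshold scaling
    # is a hypothetical choice, not the paper's prescription).
    tau = 2.0 * np.sqrt(np.log(n))
    y_trunc = np.clip(y, -tau, tau)

    # Score transform: for x ~ N(0, I) the score is S(x) = x, and the first-order
    # Stein identity makes z approximately proportional to theta* in expectation.
    z = (X * y_trunc[:, None]).mean(axis=0)

    # Over-parameterize w = u*u - v*v and run regularization-free gradient descent
    # on ||w - z||^2 from a tiny initialization with a small stepsize.
    alpha, eta, T = 1e-6, 0.1, 250                # small init, small step, early stop
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    for _ in range(T):
        grad = 2.0 * (u * u - v * v - z)          # gradient of the loss w.r.t. w
        u -= eta * grad * 2.0 * u                 # chain rule through  u*u
        v += eta * grad * 2.0 * v                 # chain rule through -v*v
    w = u * u - v * v

    # theta* is identified only up to scale, so normalize before comparing.
    w_hat = w / np.linalg.norm(w)
    print("recovered support:", np.sort(np.argsort(-np.abs(w))[:s]))
    print("l2 error of normalized estimate:", np.linalg.norm(w_hat - theta))

In this sketch the support coordinates escape the tiny initialization first while the off-support coordinates stay near alpha**2, so stopping early yields a sparse estimate with no explicit penalty, which is the implicit-regularization phenomenon the paper analyzes; the estimate recovers theta* only up to a link-dependent scale, hence the normalization before comparison.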

Similar Articles

1
Understanding Implicit Regularization in Over-Parameterized Single Index Model.
J Am Stat Assoc. 2023;118(544):2315-2328. doi: 10.1080/01621459.2022.2044824. Epub 2022 Mar 27.
2
Stochastic Mirror Descent on Overparameterized Nonlinear Models.
IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7717-7727. doi: 10.1109/TNNLS.2021.3087480. Epub 2022 Nov 30.
4
Regularization Methods Based on the Lq-Likelihood for Linear Models with Heavy-Tailed Errors.
Entropy (Basel). 2020 Sep 16;22(9):1036. doi: 10.3390/e22091036.
5
Another look at statistical learning theory and regularization.
Neural Netw. 2009 Sep;22(7):958-69. doi: 10.1016/j.neunet.2009.04.005. Epub 2009 Apr 22.
6
Gradient Learning With the Mode-Induced Loss: Consistency Analysis and Applications.
IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9686-9699. doi: 10.1109/TNNLS.2023.3236345. Epub 2024 Jul 8.
7
Minimax Rates of ℓp-Losses for High-Dimensional Linear Errors-in-Variables Models over ℓq-Balls.
Entropy (Basel). 2021 Jun 5;23(6):722. doi: 10.3390/e23060722.
8
Nonlinear Feature Selection Neural Network via Structured Sparse Regularization.
IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):9493-9505. doi: 10.1109/TNNLS.2022.3209716. Epub 2023 Oct 27.
9
Nonconvex Sparse Regularization for Deep Neural Networks and Its Optimality.
Neural Comput. 2022 Jan 14;34(2):476-517. doi: 10.1162/neco_a_01457.

Cited By

1
Sufficient dimension reduction on partially nonlinear index models with applications to medical costs analysis.
PLoS One. 2025 May 13;20(5):e0321796. doi: 10.1371/journal.pone.0321796. eCollection 2025.
2
Are Latent Factor Regression and Sparse Regression Adequate?
J Am Stat Assoc. 2024;119(546):1076-1088. doi: 10.1080/01621459.2023.2169700. Epub 2023 Feb 14.

References

1
A Shrinkage Principle for Heavy-Tailed Data: High-Dimensional Robust Low-Rank Matrix Recovery.
Ann Stat. 2021 Jun;49(3):1239-1266. doi: 10.1214/20-aos1980. Epub 2021 Aug 9.
2
Robust high dimensional factor models with applications to statistical machine learning.
Stat Sci. 2021 May;36(2):303-327. doi: 10.1214/20-sts785. Epub 2021 Apr 19.
3
A selective overview of deep learning.
Stat Sci. 2021 May;36(2):264-290. doi: 10.1214/20-sts783. Epub 2020 Apr 19.
4
A Survey of the Usages of Deep Learning for Natural Language Processing.
IEEE Trans Neural Netw Learn Syst. 2021 Feb;32(2):604-624. doi: 10.1109/TNNLS.2020.2979670. Epub 2021 Feb 4.
5
Robust Covariance Estimation for Approximate Factor Models.
J Econom. 2019 Jan;208(1):5-22. doi: 10.1016/j.jeconom.2018.09.003. Epub 2018 Oct 6.
6
Large Covariance Estimation Through Elliptical Factor Models.
Ann Stat. 2018 Aug;46(4):1383-1414. doi: 10.1214/17-AOS1588. Epub 2018 Jun 27.
7
Deep Learning for Computer Vision: A Brief Review.
Comput Intell Neurosci. 2018 Feb 1;2018:7068349. doi: 10.1155/2018/7068349. eCollection 2018.
9
Deep learning.
Nature. 2015 May 28;521(7553):436-44. doi: 10.1038/nature14539.
10
Strong Oracle Optimality of Folded Concave Penalized Estimation.
Ann Stat. 2014 Jun;42(3):819-849. doi: 10.1214/13-aos1198.
