On the Use of Minimum Penalties in Statistical Learning.

Author information

Sherwood Ben, Price Bradley S

Affiliations

School of Business, University of Kansas.

Management Information Systems Department, West Virginia University.

Publication information

J Comput Graph Stat. 2024;33(1):138-151. doi: 10.1080/10618600.2023.2210174. Epub 2023 Jun 20.

Abstract

Modern multivariate machine learning and statistical methodologies estimate parameters of interest while leveraging prior knowledge of the association between outcome variables. The methods that do allow for estimation of relationships typically do so through an error covariance matrix in multivariate regression, which does not generalize to other types of models. In this article, we propose the MinPen framework to simultaneously estimate regression coefficients associated with the multivariate regression model and the relationships between outcome variables using common assumptions. The MinPen framework utilizes a novel penalty based on the minimum function to simultaneously detect and exploit relationships between responses. An iterative algorithm is proposed as a solution to the non-convex optimization. Theoretical results such as high-dimensional convergence rates, model selection consistency, and a framework for post-selection inference are provided. We extend the proposed MinPen framework to other exponential family loss functions, with a specific focus on multiple binomial responses. Tuning parameter selection is also addressed. Finally, simulations and two data examples are presented to show the finite sample properties of this framework. Supplemental material providing proofs, additional simulations, code, and data sets is available online.
