Variable Selection and Regularization in Quantile Regression via Minimum Covariance Determinant Based Weights.

Author Information

Ranganai Edmore, Mudhombo Innocent

Affiliations

Department of Statistics, University of South Africa, Florida Campus, Private Bag X6, Florida Park, Roodepoort 1710, South Africa.

Department of Accountancy, Vaal University of Technology, Vanderbijlpark Campus, Vanderbijlpark 1900, South Africa.

Publication Information

Entropy (Basel). 2020 Dec 29;23(1):33. doi: 10.3390/e23010033.

Abstract

The importance of variable selection and regularization procedures in multiple regression analysis cannot be overemphasized. These procedures are adversely affected by data aberrations in the predictor space as well as by outliers in the response space. To counter the latter, robust statistical procedures such as quantile regression, which generalizes the well-known least absolute deviation (LAD) procedure to all quantile levels, have been proposed in the literature. Quantile regression is robust to response-variable outliers but very susceptible to outliers in the predictor space (high leverage points), which may alter the eigen-structure of the predictor matrix. High leverage points that alter the eigen-structure of the predictor matrix by creating or hiding collinearity are referred to as collinearity-influential points. In this paper, we suggest generalizing the penalized weighted least absolute deviation to all quantile levels, i.e., to penalized weighted quantile regression using the RIDGE, LASSO, and elastic net penalties, as a remedy against collinearity-influential points and high leverage points in general. To maintain robustness, we make use of very robust weights based on the computationally intensive, high-breakdown minimum covariance determinant (MCD) estimator. Simulations and applications to well-known data sets from the literature show an improvement in variable selection and regularization due to the robust weighting formulation.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f51d/7823782/762bdda5d68b/entropy-23-00033-g001.jpg
