Möttönen Jyrki, Lähderanta Tero, Salonen Janne, Sillanpää Mikko J
Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland.
Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland.
J Appl Stat. 2024 Oct 11;52(5):1081-1102. doi: 10.1080/02664763.2024.2414346. eCollection 2025.
Lasso is a popular and efficient approach to simultaneous estimation and variable selection in high-dimensional regression models. This paper presents a robust fused LAD-lasso method for multiple outcomes that addresses the challenges of non-normal outcome distributions and outlying observations. Covariates measured over space or time, across spectral bands, or along genomic positions often have a natural correlation structure that arises from the distance between the measurement points. The proposed multi-outcome approach handles such covariate blocks with a group fusion penalty, which encourages similarity between neighboring regression coefficient vectors by penalizing their differences, for example when the data are sequential. Properties of the proposed approach are illustrated by extensive simulations using BIC-type criteria for model selection. The method is also applied to real-life skewed data on retirement behavior with longitudinal heteroscedastic explanatory variables.
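To make the role of the group fusion penalty concrete, the following is a minimal sketch of an objective of this general kind. The abstract does not state the exact formulation, so everything here is an assumption for illustration: the function name `fused_lad_lasso_objective`, the two tuning parameters `lam_sparse` and `lam_fuse`, the use of the Euclidean norm of each residual vector as the multi-outcome LAD loss, and the ordering of covariates by row of the coefficient matrix `B`.

```python
import numpy as np


def fused_lad_lasso_objective(X, Y, B, lam_sparse, lam_fuse):
    """Illustrative multi-outcome fused LAD-lasso objective (assumed form).

    X: (n, p) covariates, with columns ordered so that neighbours are adjacent
    Y: (n, q) outcomes
    B: (p, q) coefficients; row j is the coefficient vector of covariate j
    """
    residuals = Y - X @ B
    # LAD-type loss: mean Euclidean norm of the residual vectors (assumption)
    lad_loss = np.linalg.norm(residuals, axis=1).mean()
    # Group lasso term: shrinks whole coefficient rows towards zero
    sparsity = np.linalg.norm(B, axis=1).sum()
    # Group fusion term: penalizes differences between neighbouring rows
    fusion = np.linalg.norm(np.diff(B, axis=0), axis=1).sum()
    return lad_loss + lam_sparse * sparsity + lam_fuse * fusion
```

In this sketch the fusion term is zero whenever neighboring coefficient vectors are identical, which is the sense in which the penalty encourages similarity between them. Minimizing such an objective and choosing the tuning parameters (for instance with a BIC-type criterion) are outside what the sketch covers.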