Boonstra Philip S, Mukherjee Bhramar, Taylor Jeremy M G
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109.
Stat Sin. 2015 Jul 1;25(3):1185-1206. doi: 10.5705/ss.2013.284.
We propose new approaches for choosing the shrinkage parameter in ridge regression, a penalized likelihood method for regularizing linear regression coefficients, when the number of observations is small relative to the number of parameters. Existing methods may lead to extreme choices of this parameter, which either do not shrink the coefficients enough or shrink them too much. Within this "small-n, large-p" context, we suggest a correction to the common generalized cross-validation (GCV) method that preserves the asymptotic optimality of the original GCV. We also introduce the notion of a "hyperpenalty", which shrinks the shrinkage parameter itself, and make a specific recommendation regarding the choice of hyperpenalty that empirically works well in a broad range of scenarios. A simple algorithm jointly estimates the shrinkage parameter and the regression coefficients in the hyperpenalized likelihood. In a comprehensive simulation study of small-sample scenarios, our proposed approaches offer superior prediction compared with nine existing methods.
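As a concrete illustration of the baseline criterion that the abstract's correction targets, the sketch below implements standard GCV selection of the ridge shrinkage parameter in Python. It is a minimal sketch of the conventional method only, not the authors' corrected GCV or their hyperpenalty algorithm; the function name ridge_gcv, the lambda grid, and the toy data are illustrative assumptions.

```python
import numpy as np

def ridge_gcv(X, y, lambdas):
    """Select the ridge shrinkage parameter lambda by standard generalized
    cross-validation (GCV): minimize (RSS/n) / (1 - tr(H_lambda)/n)^2.
    This is the conventional criterion, not the paper's corrected version."""
    n = X.shape[0]
    # One SVD of the design matrix lets us evaluate every lambda cheaply.
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    scores = []
    for lam in lambdas:
        shrink = d**2 / (d**2 + lam)   # eigenvalue-wise shrinkage factors
        fitted = U @ (shrink * Uty)    # H_lambda @ y, the ridge fit
        df = shrink.sum()              # tr(H_lambda), effective degrees of freedom
        rss = np.sum((y - fitted) ** 2)
        scores.append((rss / n) / (1.0 - df / n) ** 2)
    return lambdas[int(np.argmin(scores))]

# Hypothetical "small-n, large-p" example: n = 30 observations, p = 100 predictors.
rng = np.random.default_rng(0)
n, p = 30, 100
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.standard_normal(n)
lam_hat = ridge_gcv(X, y, np.logspace(-2, 4, 61))
print(f"GCV-selected shrinkage parameter: {lam_hat:.3g}")
```

Note that when tr(H_lambda) approaches n, the GCV denominator approaches zero and the criterion can behave erratically for small lambda; this is the kind of extreme, under- or over-shrinking choice in small-n, large-p settings that motivates the paper's corrected GCV and hyperpenalty approaches.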