

Trading variance reduction with unbiasedness: the regularized subspace information criterion for robust model selection in kernel regression.

Authors

Sugiyama Masashi, Kawanabe Motoaki, Müller Klaus-Robert

Affiliation

Fraunhofer FIRST, IDA, 12489 Berlin, Germany.

Publication

Neural Comput. 2004 May;16(5):1077-104. doi: 10.1162/089976604773135113.

Abstract

A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit: we stabilize unbiased generalization error estimates by regularization and thereby obtain more robust model selection criteria for learning. We trade a small bias for a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and earlier experiments showed that a small regularization of SIC has a stabilizing effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error, and propose determining the degree of regularization of SIC such that this estimator is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method effectively improves the precision of SIC, especially in high-noise-level cases. We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.
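The abstract's closing comparison concerns ridge-parameter selection in kernel regression. The SIC criterion itself involves quantities defined only in the full paper, so as a minimal, hedged sketch of the setting, the snippet below fits a Gaussian-kernel ridge regressor and selects its ridge parameter by closed-form leave-one-out cross-validation — one of the baseline methods the authors compare against, not their proposed criterion. The kernel width, data, and parameter grid are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(X1, X2, width=1.0):
    # Gram matrix of the Gaussian (RBF) kernel.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def loo_error(K, y, lam):
    # Closed-form leave-one-out squared error for kernel ridge regression:
    # with hat matrix H = K (K + lam I)^{-1}, the LOO residual is
    # e_i = (y_i - yhat_i) / (1 - H_ii).
    n = len(y)
    H = K @ np.linalg.solve(K + lam * np.eye(n), np.eye(n))
    resid = (y - H @ y) / (1.0 - np.diag(H))
    return np.mean(resid ** 2)

# Illustrative toy data: noisy sinc function (a common benchmark shape).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sinc(X[:, 0]) + 0.2 * rng.standard_normal(50)

K = gaussian_kernel(X, X)
lams = np.logspace(-4, 2, 25)
best = min(lams, key=lambda lam: loo_error(K, y, lam))
```

The closed-form LOO score avoids refitting the model `n` times, which is why it is a practical baseline for ridge-parameter selection; the paper's point is that a regularized SIC can be more precise on a single training set, especially at high noise levels.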

