Improving generalization performance of natural gradient learning using optimized regularization by NIC.

Authors

Park Hyeyoung, Murata Noboru, Amari Shun-Ichi

Affiliation

Brain Science Institute, RIKEN, Saitama, Japan.

Publication

Neural Comput. 2004 Feb;16(2):355-82. doi: 10.1162/089976604322742065.

Abstract

Natural gradient learning is known to be efficient in escaping plateaus, which are a main cause of the slow learning speed of neural networks. An adaptive natural gradient learning method for practical implementation has also been developed, and its advantages on real-world problems have been confirmed. In this letter, we deal with the generalization performance of the natural gradient method. Since natural gradient learning fits parameters to training data quickly, overfitting may easily occur, resulting in poor generalization performance. To solve this problem, we introduce a regularization term into natural gradient learning and propose an efficient method for optimizing the regularization strength using a generalized Akaike information criterion (the network information criterion, NIC). We discuss the properties of the regularization strength optimized by NIC through theoretical analysis as well as computer simulations, and we confirm the computational efficiency and generalization performance of the proposed method through experiments on benchmark problems.
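
To make the procedure concrete, below is a minimal sketch in Python of regularized natural gradient learning combined with an NIC-style score of the form "training loss + tr(G^{-1}Q)/N" for comparing regularization strengths. It assumes a toy linear-Gaussian regression model, where the Fisher matrix G is available in closed form; the paper itself treats multilayer perceptrons and an adaptively estimated inverse Fisher matrix, so all names and model choices here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Sketch (not the authors' code): natural gradient descent with an L2
# regularization term, plus an NIC-style criterion
#     NIC(lambda) = training loss + tr(G^{-1} Q) / N
# used to compare regularization strengths lambda.

rng = np.random.default_rng(0)

# Toy regression data: y = X w_true + noise
N, d = 200, 5
X = rng.normal(size=(N, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.3 * rng.normal(size=N)

def grads(w):
    """Training loss and per-example gradients of the squared error."""
    r = X @ w - y                   # residuals, shape (N,)
    g_i = r[:, None] * X            # per-example gradients, shape (N, d)
    return 0.5 * np.mean(r ** 2), g_i

def fisher(lam):
    """Fisher matrix of the regularized model (exact for this toy model)."""
    return X.T @ X / N + lam * np.eye(d)

def natural_gradient_fit(lam, eta=1.0, steps=50):
    """Regularized natural gradient learning: w <- w - eta * G^{-1} grad."""
    w = np.zeros(d)
    G = fisher(lam)                 # constant here; adaptive for an MLP
    for _ in range(steps):
        _, g_i = grads(w)
        grad = g_i.mean(axis=0) + lam * w   # gradient of regularized loss
        w -= eta * np.linalg.solve(G, grad)
    return w

def nic(w, lam):
    """NIC-style score: empirical loss + complexity term tr(G^{-1} Q) / N."""
    train_loss, g_i = grads(w)
    Q = g_i.T @ g_i / N             # covariance of the per-example gradients
    penalty = np.trace(np.linalg.solve(fisher(lam), Q)) / N
    return train_loss + penalty

# Pick the regularization strength with the smallest NIC score.
for lam in (0.0, 1e-3, 1e-2, 1e-1, 1.0):
    w_hat = natural_gradient_fit(lam)
    print(f"lambda={lam:g}  NIC={nic(w_hat, lam):.5f}")
```

In this sketch the complexity term tr(G^{-1}Q)/N shrinks as lambda grows (the regularized Fisher matrix dominates the gradient covariance), while the training loss rises, so scanning lambda and taking the minimizer of the NIC score trades the two off, which is the role the criterion plays in the paper.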
