Regularization Effect of Random Node Fault/Noise on Gradient Descent Learning Algorithm.

Authors

John Sum, Chi-Sing Leung

Publication

IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2619-2632. doi: 10.1109/TNNLS.2021.3107051. Epub 2023 May 2.

Abstract

For decades, adding fault/noise during gradient descent training has been a technique for making a neural network (NN) tolerant to persistent fault/noise, or for improving its generalization. In recent years, this technique has been re-advocated in deep learning as a way to avoid overfitting. Yet, the objective function of such fault/noise-injection learning has been misinterpreted as the desired measure of the NN with the same fault/noise, i.e., the expected mean squared error (MSE) over the training samples. The aims of this article are: 1) to clarify the above misconception and 2) to investigate the actual regularization effect of adding node fault/noise when training by gradient descent. Based on previous works on adding fault/noise during training, we conjecture why the misconception arose. It is then shown that the learning objective of adding random node fault during gradient descent learning (GDL) for a multilayer perceptron (MLP) is identical to the desired measure of the MLP with the same fault. If additive (resp. multiplicative) node noise is added during GDL for an MLP, the learning objective is not identical to the desired measure of the MLP with such noise. For radial basis function (RBF) networks, it is shown that the learning objective is identical to the corresponding desired measure under all three fault/noise conditions. Empirical evidence is presented to support these theoretical results and, hence, to clarify the misconception: the objective function of fault/noise-injection learning cannot, in general, be interpreted as the desired measure of the NN with the same fault/noise. Afterward, the regularization effect of adding node fault/noise during training is revealed for the case of RBF networks. Notably, it is shown that the regularization effect of adding additive or multiplicative node noise (MNN) while training an RBF network is to reduce network complexity, and that applying dropout regularization to RBF networks has the same effect as adding MNN during training.
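To make the two quantities being compared concrete, below is a minimal NumPy sketch of the MNN-injection setup the abstract discusses: a Gaussian RBF network with fixed, evenly spaced centers is trained by gradient descent with fresh multiplicative node noise injected at every step, and the desired measure (the expected MSE of the trained network under the same noise) is then estimated by Monte Carlo. The toy sinc dataset, hyperparameters, and all variable names are illustrative assumptions, not the paper's experiment.

import numpy as np

rng = np.random.default_rng(0)

# --- Toy regression data (assumed setup, not from the paper) ---
N = 200
x = np.linspace(-3, 3, N)[:, None]
y = np.sinc(x).ravel() + 0.05 * rng.standard_normal(N)

# --- Gaussian RBF hidden layer with fixed centers ---
M = 20                                    # number of hidden nodes
centers = np.linspace(-3, 3, M)[None, :]  # fixed, evenly spaced centers
width = 0.5

def hidden(x):
    """Hidden-node outputs phi_j(x) = exp(-(x - c_j)^2 / (2 s^2))."""
    return np.exp(-(x - centers) ** 2 / (2 * width ** 2))

Phi = hidden(x)  # (N, M) design matrix

# --- Gradient descent with multiplicative node noise (MNN) injection ---
w = np.zeros(M)
lr, sigma_b = 0.05, 0.5  # learning rate and MNN std (illustrative values)
for step in range(5000):
    b = sigma_b * rng.standard_normal(M)  # fresh noise each step
    Phi_noisy = Phi * (1.0 + b)           # phi_j -> phi_j * (1 + b_j)
    err = Phi_noisy @ w - y
    w -= lr * (Phi_noisy.T @ err) / N     # gradient step on the noisy network

# --- Desired measure: expected MSE of the *trained* network under MNN ---
mse_runs = []
for _ in range(2000):
    b = sigma_b * rng.standard_normal(M)
    pred = (Phi * (1.0 + b)) @ w
    mse_runs.append(np.mean((pred - y) ** 2))
print("noise-free training MSE :", np.mean((Phi @ w - y) ** 2))
print("desired measure (E[MSE]):", np.mean(mse_runs))

For an RBF network, the abstract's result says the injection-learning objective coincides with this desired measure; the gap between the two printed MSEs is then the implicit regularization term that the injected noise adds on top of the noise-free training error.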

