Convergence analyses on on-line weight noise injection-based training algorithms for MLPs.

Publication information

IEEE Trans Neural Netw Learn Syst. 2012 Nov;23(11):1827-40. doi: 10.1109/TNNLS.2012.2210243.

Abstract

Injecting weight noise during training is a simple technique that has been proposed for almost two decades. However, little is known about its convergence behavior. This paper studies the convergence of two weight noise injection-based training algorithms: multiplicative weight noise injection with weight decay and additive weight noise injection with weight decay. We consider their application to multilayer perceptrons with either linear or sigmoid output nodes. Let w(t) be the weight vector, V(w) the corresponding objective function of the training algorithm, α > 0 the weight decay constant, and μ(t) the step size. We show that if μ(t) → 0, then with probability one E[||w(t)||_2^2] is bounded and lim_{t→∞} ||w(t)||_2 exists. Based on these two properties, we show that if μ(t) → 0, Σ_t μ(t) = ∞, and Σ_t μ(t)^2 < ∞, then with probability one these algorithms converge. Moreover, w(t) converges with probability one to a point where ∇_w V(w) = 0.
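To make the setup concrete, the following is a minimal sketch of one of the two algorithms studied here, multiplicative weight noise injection with weight decay, applied to a toy linear model rather than a full MLP. The data, the noise level sigma, and the step-size schedule μ(t) = 1/(10 + t) are illustrative assumptions; the schedule is just one choice satisfying the paper's conditions μ(t) → 0, Σ_t μ(t) = ∞, Σ_t μ(t)^2 < ∞.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical, not the paper's experiments):
# targets generated by a known weight vector plus small output noise.
w_true = np.array([1.0, -2.0])
X = rng.normal(size=(200, 2))
y = X @ w_true + 0.01 * rng.normal(size=200)

alpha = 1e-3   # weight decay constant, alpha > 0
sigma = 0.05   # multiplicative weight-noise level (assumed)
w = np.zeros(2)

for t in range(1, 5001):
    # Step size obeying mu(t) -> 0, sum mu(t) = inf, sum mu(t)^2 < inf.
    mu = 1.0 / (10.0 + t)
    i = rng.integers(len(X))
    # Multiplicative weight noise: perturb the weights before the
    # gradient evaluation, but update the clean weights.
    w_noisy = w * (1.0 + sigma * rng.normal(size=w.shape))
    grad = (X[i] @ w_noisy - y[i]) * X[i]   # per-sample squared-error gradient
    w = w - mu * (grad + alpha * w)         # SGD step with weight decay

print(w)  # drifts toward w_true (up to weight-decay shrinkage and noise)
```

The key structural point mirrored from the abstract is that the noise enters only through the perturbed copy w_noisy used in the gradient, while the weight decay term αw acts on the stored weights; the convergence result concerns this stored sequence w(t).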
