A Limitation of Gradient Descent Learning.

Authors

Sum John, Leung Chi-Sing, Ho Kevin

Publication Information

IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):2227-2232. doi: 10.1109/TNNLS.2019.2927689. Epub 2019 Aug 6.

DOI: 10.1109/TNNLS.2019.2927689
PMID: 31398136
Abstract

For decades, gradient descent has been used to derive learning algorithms for training neural networks (NNs). In this brief, a limitation of applying such an algorithm to train an NN with persistent weight noise is revealed. Let V(w) be the performance measure of an ideal NN; V(w) is used to derive the gradient descent learning (GDL) algorithm. With weight noise, the desired performance measure, denoted J(w), is E[V(~w)|w], where ~w is the noisy weight vector. When GDL is applied to train an NN with weight noise, the actual learning objective is clearly not V(w) but another scalar function L(w). For decades, there has been a misconception that L(w) = J(w), and hence that the model actually attained by GDL is the desired model. We show that this need not hold: 1) with persistent additive weight noise, the attained model is the desired model, since L(w) = J(w); but 2) with persistent multiplicative weight noise, the attained model is unlikely to be the desired model, since L(w) ≠ J(w). Accordingly, the properties of the attained models are analyzed in comparison with the desired models, and the learning curves are sketched. Simulation results on 1) a simple regression problem and 2) MNIST handwritten digit recognition are presented to support these claims.
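Below is a minimal numerical sketch of the distinction the abstract draws between the desired objective J(w) = E[V(~w)|w] and the objective GDL actually minimizes under persistent weight noise. It is an illustration on a scalar least-squares V(w), not the paper's simulation; the names grad_V and run_gdl and the values of x, y, sigma, lr, and steps are assumptions chosen for the example.

```python
# Minimal sketch (illustrative, not the paper's experiment): gradient descent
# learning (GDL) on a scalar least-squares measure V(w) = 0.5*(w*x - y)^2,
# with persistent additive vs. multiplicative weight noise.
import numpy as np

rng = np.random.default_rng(0)
x, y = 1.0, 2.0        # single data point; the minimizer of V(w) is w = y/x = 2
sigma = 0.5            # weight-noise standard deviation (assumed value)
lr, steps = 0.01, 100_000

def grad_V(w):
    # dV/dw for V(w) = 0.5*(w*x - y)^2
    return x * (w * x - y)

def run_gdl(make_noisy):
    # GDL with persistent weight noise: each step evaluates the gradient at the
    # noisy weight ~w = make_noisy(w, b) and applies it to the stored weight w.
    w, tail = 0.0, []
    for t in range(steps):
        b = sigma * rng.standard_normal()
        w -= lr * grad_V(make_noisy(w, b))
        if t >= steps // 2:        # average the tail: the model GDL attains
            tail.append(w)
    return float(np.mean(tail))

w_add = run_gdl(lambda w, b: w + b)            # ~w = w + b
w_mul = run_gdl(lambda w, b: w * (1.0 + b))    # ~w = w*(1 + b)

# Desired models, i.e. minimizers of J(w) = E[V(~w) | w]:
#   additive:       J(w) = V(w) + 0.5*sigma^2*x^2       -> argmin J = y/x = 2.0
#   multiplicative: J(w) = V(w) + 0.5*sigma^2*w^2*x^2   -> argmin J = (y/x)/(1+sigma^2) = 1.6
print(f"additive:       attained w = {w_add:.3f}, desired w = {y / x:.3f}")
print(f"multiplicative: attained w = {w_mul:.3f}, desired w = {y / x / (1 + sigma**2):.3f}")
```

On a typical run the additive case should settle near w = 2.0, which is also the minimizer of J(w), whereas the multiplicative case should also settle near w = 2.0 even though the minimizer of J(w) is 1.6; the attained model differs from the desired one, consistent with claims 1) and 2) of the abstract.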

Similar Articles

1. A Limitation of Gradient Descent Learning.
IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):2227-2232. doi: 10.1109/TNNLS.2019.2927689. Epub 2019 Aug 6.
2. Regularization Effect of Random Node Fault/Noise on Gradient Descent Learning Algorithm.
IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2619-2632. doi: 10.1109/TNNLS.2021.3107051. Epub 2023 May 2.
3. Convergence analyses on on-line weight noise injection-based training algorithms for MLPs.
IEEE Trans Neural Netw Learn Syst. 2012 Nov;23(11):1827-40. doi: 10.1109/TNNLS.2012.2210243.
4. Learning Algorithm for Boltzmann Machines With Additive Weight and Bias Noise.
IEEE Trans Neural Netw Learn Syst. 2019 Oct;30(10):3200-3204. doi: 10.1109/TNNLS.2018.2889072. Epub 2019 Jan 18.
5. Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation.
IEEE Trans Neural Netw. 2008 Aug;19(8):1369-88. doi: 10.1109/TNN.2008.2000452.
6. The removal of EMG in EEG by neural networks.
Physiol Meas. 2010 Dec;31(12):1567-84. doi: 10.1088/0967-3334/31/12/002. Epub 2010 Oct 27.
7. Learning curves for stochastic gradient descent in linear feedforward networks.
Neural Comput. 2005 Dec;17(12):2699-718. doi: 10.1162/089976605774320539.
8. Direct Error-Driven Learning for Deep Neural Networks With Applications to Big Data.
IEEE Trans Neural Netw Learn Syst. 2020 May;31(5):1763-1770. doi: 10.1109/TNNLS.2019.2920964. Epub 2019 Jul 15.
9. Accuracy Maximization Analysis for Sensory-Perceptual Tasks: Computational Improvements, Filter Robustness, and Coding Advantages for Scaled Additive Noise.
PLoS Comput Biol. 2017 Feb 8;13(2):e1005281. doi: 10.1371/journal.pcbi.1005281. eCollection 2017 Feb.
10. On objective function, regularizer, and prediction error of a learning algorithm for dealing with multiplicative weight noise.
IEEE Trans Neural Netw. 2009 Jan;20(1):124-38. doi: 10.1109/TNN.2008.2005596. Epub 2008 Dec 22.

Cited By

1. Machine Learning Can Predict the Probability of Biologic Therapy in Patients with Inflammatory Bowel Disease.
J Clin Med. 2022 Aug 5;11(15):4586. doi: 10.3390/jcm11154586.