
Hebbian Descent: A Unified View on Log-Likelihood Learning.

Authors

Melchior Jan, Schiewer Robin, Wiskott Laurenz

Affiliations

Ruhr University Bochum, 44801 Bochum, Germany

Publication

Neural Comput. 2024 Aug 19;36(9):1669-1712. doi: 10.1162/neco_a_01684.

DOI: 10.1162/neco_a_01684
PMID: 39163553
Abstract

This study discusses the negative impact of the derivative of the activation functions in the output layer of artificial neural networks, in particular in continual learning. We propose Hebbian descent as a theoretical framework to overcome this limitation, which is implemented through an alternative loss function for gradient descent we refer to as Hebbian descent loss. This loss is effectively the generalized log-likelihood loss and corresponds to an alternative weight update rule for the output layer wherein the derivative of the activation function is disregarded. We show how this update avoids vanishing error signals during backpropagation in saturated regions of the activation functions, which is particularly helpful in training shallow neural networks and deep neural networks where saturating activation functions are only used in the output layer. In combination with centering, Hebbian descent leads to better continual learning capabilities. It provides a unifying perspective on Hebbian learning, gradient descent, and generalized linear models, for all of which we discuss the advantages and disadvantages. Given activation functions with strictly positive derivative (as is often the case in practice), Hebbian descent inherits the convergence properties of regular gradient descent. While established pairings of loss and output-layer activation function (e.g., mean squared error with linear or cross-entropy with sigmoid/softmax) are subsumed by Hebbian descent, we provide general insights for designing arbitrary loss-activation-function combinations that benefit from Hebbian descent. For shallow networks, we show that Hebbian descent outperforms Hebbian learning, performs similarly to regular gradient descent, and performs much better than all other tested update rules in continual learning. In combination with centering, Hebbian descent implements a forgetting mechanism that prevents catastrophic interference notably better than the other tested update rules. When training deep neural networks, our experimental results suggest that Hebbian descent performs better than or similarly to gradient descent.
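The core contrast the abstract describes — dropping the activation-function derivative from the output-layer update — can be sketched in a few lines. This is an illustrative NumPy sketch, not the authors' implementation: it assumes a single-layer network with a sigmoid output trained on squared error, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gradient_descent_update(W, x, y, lr=0.1):
    # Squared-error gradient for a sigmoid output unit: the error
    # signal is scaled by the activation derivative h'(a) = h(1 - h).
    h = sigmoid(W @ x)
    delta = (h - y) * h * (1.0 - h)   # includes h'(a)
    return W - lr * np.outer(delta, x)

def hebbian_descent_update(W, x, y, lr=0.1):
    # Hebbian descent disregards h'(a): the update is (h - y) x^T,
    # i.e. the gradient of the generalized log-likelihood loss
    # (for sigmoid, identical to the cross-entropy gradient).
    h = sigmoid(W @ x)
    delta = h - y                     # no activation derivative
    return W - lr * np.outer(delta, x)
```

In a saturated region (large |W @ x|), h'(a) is close to zero, so the plain squared-error update barely moves the weights, while the Hebbian descent update still pushes them in the error-reducing direction — the vanishing-error-signal effect the abstract refers to.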


Similar Articles

1. Power Function Error Initialization Can Improve Convergence of Backpropagation Learning in Neural Networks for Classification.
Neural Comput. 2021 Jul 26;33(8):2193-2225. doi: 10.1162/neco_a_01407.
2. Merging Back-propagation and Hebbian Learning Rules for Robust Classifications.
Neural Netw. 1996 Oct;9(7):1213-1222. doi: 10.1016/0893-6080(96)00042-1.
3. Learning smooth dendrite morphological neurons by stochastic gradient descent for pattern classification.
Neural Netw. 2023 Nov;168:665-676. doi: 10.1016/j.neunet.2023.09.033. Epub 2023 Sep 25.
4. Deep convolutional neural network and IoT technology for healthcare.
Digit Health. 2024 Jan 17;10:20552076231220123. doi: 10.1177/20552076231220123. eCollection 2024 Jan-Dec.
5. Hebbian semi-supervised learning in a sample efficiency setting.
Neural Netw. 2021 Nov;143:719-731. doi: 10.1016/j.neunet.2021.08.003. Epub 2021 Aug 13.
6. Accelerating the training of feedforward neural networks using generalized Hebbian rules for initializing the internal representations.
IEEE Trans Neural Netw. 1996;7(2):419-26. doi: 10.1109/72.485677.
7. A theory of local learning, the learning channel, and the optimality of backpropagation.
Neural Netw. 2016 Nov;83:51-74. doi: 10.1016/j.neunet.2016.07.006. Epub 2016 Aug 5.
8. Learning cortical hierarchies with temporal Hebbian updates.
Front Comput Neurosci. 2023 May 24;17:1136010. doi: 10.3389/fncom.2023.1136010. eCollection 2023.
9. Modelling continual learning in humans with Hebbian context gating and exponentially decaying task signals.
PLoS Comput Biol. 2023 Jan 19;19(1):e1010808. doi: 10.1371/journal.pcbi.1010808. eCollection 2023 Jan.