

Exploration in neo-Hebbian reinforcement learning: Computational approaches to the exploration-exploitation balance with bio-inspired neural networks.

Affiliation

The Center for Advanced Computer Studies, University of Louisiana at Lafayette, 301 East Lewis Street, P.O. Box 43694, Lafayette, LA 70504-3694, United States of America.

Publication information

Neural Netw. 2022 Jul;151:16-33. doi: 10.1016/j.neunet.2022.03.021. Epub 2022 Mar 23.

DOI: 10.1016/j.neunet.2022.03.021
PMID: 35367735
Abstract

Recent theoretical and experimental works have connected Hebbian plasticity with the reinforcement learning (RL) paradigm, producing a class of trial-and-error learning in artificial neural networks known as neo-Hebbian plasticity. Inspired by the role of the neuromodulator dopamine in synaptic modification, neo-Hebbian RL methods extend unsupervised Hebbian learning rules with value-based modulation to selectively reinforce associations. This reinforcement allows for learning exploitative behaviors and produces RL models with strong biological plausibility. The review begins with coverage of fundamental concepts in rate- and spike-coded models. We introduce Hebbian correlation detection as a basis for modification of synaptic weighting and progress to neo-Hebbian RL models guided solely by extrinsic rewards. We then analyze state-of-the-art neo-Hebbian approaches to the exploration-exploitation balance under the RL paradigm, emphasizing works that employ additional mechanics to modulate that dynamic. Our review of neo-Hebbian RL methods in this context indicates substantial potential for novel improvements in exploratory learning, primarily through stronger incorporation of intrinsic motivators. We provide a number of research suggestions for this pursuit by drawing from modern theories and results in neuroscience and psychology. The exploration-exploitation balance is a central issue in RL research, and this review is the first to focus on it under the neo-Hebbian RL framework.
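The three-factor ("neo-Hebbian") rule the abstract describes — an unsupervised Hebbian correlation term gated by a dopamine-like value signal — can be illustrated with a toy sketch. Everything below (the two-armed bandit setup, the epsilon-greedy exploration, the parameter values, and all names) is an illustrative assumption for this sketch, not a method taken from the review:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a rate-coded linear layer maps a stimulus vector
# to action preferences in a two-armed bandit.
n_inputs, n_actions = 4, 2
W = rng.normal(scale=0.1, size=(n_actions, n_inputs))
eta, epsilon, baseline = 0.1, 0.1, 0.0

def step(stimulus, rewarded_action):
    """One trial: act, collect reward, apply a three-factor weight update."""
    global W, baseline
    values = W @ stimulus
    # Exploration-exploitation balance: exploit greedily most of the time,
    # explore uniformly at random with probability epsilon.
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(values))
    reward = 1.0 if action == rewarded_action else 0.0
    # Neo-Hebbian (three-factor) rule: pre-activity x post-activity x modulator,
    # where the modulator is a dopamine-like reward-prediction signal.
    post = np.zeros(n_actions)
    post[action] = 1.0
    modulator = reward - baseline
    W += eta * modulator * np.outer(post, stimulus)
    baseline += 0.05 * (reward - baseline)  # slowly track the mean reward
    return reward

stimulus = np.array([1.0, 0.0, 0.0, 0.0])
rewards = [step(stimulus, rewarded_action=0) for _ in range(500)]
# After training, the greedy choice for this stimulus should be action 0.
```

Without the modulator the update is purely correlational (classic Hebbian); multiplying by a reward-prediction term is what lets the same rule selectively reinforce exploitative associations, which is the mechanism the review surveys.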


Similar articles

1. Exploration in neo-Hebbian reinforcement learning: Computational approaches to the exploration-exploitation balance with bio-inspired neural networks.
   Neural Netw. 2022 Jul;151:16-33. doi: 10.1016/j.neunet.2022.03.021. Epub 2022 Mar 23.
2. A reinforcement learning framework for spiking networks with dynamic synapses.
   Comput Intell Neurosci. 2011;2011:869348. doi: 10.1155/2011/869348. Epub 2011 Oct 23.
3. Deep Reinforcement Learning With Modulated Hebbian Plus Q-Network Architecture.
   IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2045-2056. doi: 10.1109/TNNLS.2021.3110281. Epub 2022 May 2.
4. Combining STDP and binary networks for reinforcement learning from images and sparse rewards.
   Neural Netw. 2021 Dec;144:496-506. doi: 10.1016/j.neunet.2021.09.010. Epub 2021 Sep 17.
5. Neuro-Inspired Reinforcement Learning to Improve Trajectory Prediction in Reward-Guided Behavior.
   Int J Neural Syst. 2022 Sep;32(9):2250038. doi: 10.1142/S0129065722500381. Epub 2022 Aug 19.
6. Classic Hebbian learning endows feed-forward networks with sufficient adaptability in challenging reinforcement learning tasks.
   J Neurophysiol. 2021 Jun 1;125(6):2034-2037. doi: 10.1152/jn.00712.2020. Epub 2021 Apr 28.
7. Asymmetric and adaptive reward coding via normalized reinforcement learning.
   PLoS Comput Biol. 2022 Jul 21;18(7):e1010350. doi: 10.1371/journal.pcbi.1010350. eCollection 2022 Jul.
8. Reinforcement Learning in Spiking Neural Networks with Stochastic and Deterministic Synapses.
   Neural Comput. 2019 Dec;31(12):2368-2389. doi: 10.1162/neco_a_01238. Epub 2019 Oct 15.
9. Nutrient-Sensitive Reinforcement Learning in Monkeys.
   J Neurosci. 2023 Mar 8;43(10):1714-1730. doi: 10.1523/JNEUROSCI.0752-22.2022. Epub 2023 Jan 20.
10. Memory-Dependent Computation and Learning in Spiking Neural Networks Through Hebbian Plasticity.
   IEEE Trans Neural Netw Learn Syst. 2025 Feb;36(2):2551-2562. doi: 10.1109/TNNLS.2023.3341446. Epub 2025 Feb 6.

Cited by

1. On Predictive Planning and Counterfactual Learning in Active Inference.
   Entropy (Basel). 2024 May 31;26(6):484. doi: 10.3390/e26060484.
2. Inhibition of Dopamine Neurons Prevents Incentive Value Encoding of a Reward Cue: With Revelations from Deep Phenotyping.
   J Neurosci. 2023 Nov 1;43(44):7376-7392. doi: 10.1523/JNEUROSCI.0848-23.2023. Epub 2023 Sep 14.
3. Inhibition of dopamine neurons prevents incentive value encoding of a reward cue: With revelations from deep phenotyping.
   bioRxiv. 2023 May 5:2023.05.03.539324. doi: 10.1101/2023.05.03.539324.