Combining backpropagation with Equilibrium Propagation to improve an Actor-Critic reinforcement learning framework.

Author information

Kubo Yoshimasa, Chalmers Eric, Luczak Artur

Affiliations

Canadian Centre for Behavioural Neuroscience, University of Lethbridge, Lethbridge, AB, Canada.

Department of Mathematics and Computing, Mount Royal University, Calgary, AB, Canada.

Publication information

Front Comput Neurosci. 2022 Aug 23;16:980613. doi: 10.3389/fncom.2022.980613. eCollection 2022.

Abstract

Backpropagation (BP) has been used to train neural networks for many years, allowing them to solve a wide variety of tasks such as image classification, speech recognition, and reinforcement learning. However, the biological plausibility of BP as a mechanism of neural learning has been questioned. Equilibrium Propagation (EP) has been proposed as a more biologically plausible alternative and achieves accuracy comparable to BP on the CIFAR-10 image classification task. This study proposes the first EP-based reinforcement learning architecture: an Actor-Critic architecture with the actor network trained by EP. We show that this model can solve the basic control tasks often used as benchmarks for BP-based models. Interestingly, our trained model demonstrates more consistent high-reward behavior than a comparable model trained exclusively by BP.
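
For readers unfamiliar with the Actor-Critic setup, the following is a minimal sketch of the textbook one-step learning signal: the critic's temporal-difference (TD) error, and a policy-gradient step on the actor's action logits. It is not the authors' implementation; the abstract does not describe how the EP-trained actor receives its training signal. The function names (`td_error`, `softmax`, `actor_target`) and the parameters `gamma` and `step` are hypothetical, chosen only for illustration.

```python
import numpy as np


def td_error(reward, value, next_value, gamma=0.99, done=False):
    """One-step TD error: r + gamma * V(s') - V(s).

    The bootstrap term gamma * V(s') is dropped when the episode has ended.
    """
    target = reward + (0.0 if done else gamma * next_value)
    return target - value


def softmax(logits):
    """Numerically stable softmax over a 1-D array of action logits."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()


def actor_target(logits, action, delta, step=0.1):
    """Hypothetical helper: one policy-gradient step taken directly on the logits.

    The gradient of log pi(action) with respect to the logits is
    (one_hot(action) - softmax(logits)); scaling it by the TD error raises
    the probability of the taken action when delta > 0 and lowers it when
    delta < 0.
    """
    probs = softmax(logits)
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    return logits + step * delta * (one_hot - probs)


if __name__ == "__main__":
    logits = np.zeros(3)  # assumed: three discrete actions, uniform policy
    delta = td_error(reward=1.0, value=0.5, next_value=1.0, gamma=0.9)
    print("TD error:", delta)
    print("Shifted logits:", actor_target(logits, action=1, delta=delta))
```

In a BP-only model, both networks would be updated by backpropagating losses built from this TD error. How the same signal is delivered to an EP-trained actor is specific to the paper and is not reproduced here.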

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a58c/9446087/0731bfdd5b96/fncom-16-980613-g001.jpg
