Human-Level Control Through Directly Trained Deep Spiking Q-Networks.

Authors

Liu Guisong, Deng Wenjie, Xie Xiurui, Huang Li, Tang Huajin

Publication information

IEEE Trans Cybern. 2023 Nov;53(11):7187-7198. doi: 10.1109/TCYB.2022.3198259. Epub 2023 Oct 17.

DOI: 10.1109/TCYB.2022.3198259
PMID: 36063509
Abstract

As the third-generation neural networks, spiking neural networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, deep spiking reinforcement learning (DSRL), that is, the reinforcement learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and the nondifferentiable property of the spiking function. To address these issues, we propose a deep spiking Q-network (DSQN) in this article. Specifically, we propose a directly trained DSRL architecture based on the leaky integrate-and-fire (LIF) neurons and deep Q-network (DQN). Then, we adapt a direct spiking learning algorithm for the DSQN. We further demonstrate the advantages of using LIF neurons in DSQN theoretically. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, generalization and energy efficiency. To the best of our knowledge, our work is the first one to achieve state-of-the-art performance on multiple Atari games with the directly trained SNN.
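
The abstract names two obstacles to direct training: the spiking function has binary output and is nondifferentiable. The sketch below illustrates both with a generic discrete-time LIF neuron. The abstract gives no equations, so this is not the paper's formulation: the hard reset, the sigmoid-derivative surrogate gradient, and the names `tau`, `v_threshold`, `v_reset`, and `alpha` are all assumptions chosen for illustration.

```python
import math


def heaviside(x: float) -> float:
    """Spiking function: binary output, zero gradient almost everywhere."""
    return 1.0 if x >= 0.0 else 0.0


def surrogate_grad(x: float, alpha: float = 2.0) -> float:
    """Derivative of a sigmoid, a common stand-in for the Heaviside
    derivative during backpropagation (assumed choice, not from the paper)."""
    s = 1.0 / (1.0 + math.exp(-alpha * x))
    return alpha * s * (1.0 - s)


def lif_run(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Returns (spikes, potentials), where potentials are the membrane
    potentials recorded before any reset is applied.
    """
    v = v_reset
    spikes, potentials = [], []
    for x in inputs:
        # Leaky integration: v relaxes toward v_reset and is driven by x.
        v = v + (x - (v - v_reset)) / tau
        potentials.append(v)
        s = heaviside(v - v_threshold)
        spikes.append(s)
        # Hard reset after a spike (hypothetical reset rule).
        if s == 1.0:
            v = v_reset
    return spikes, potentials


# Constant input current for six time steps.
spikes, potentials = lif_run([1.5] * 6)
firing_rate = sum(spikes) / len(spikes)
print(spikes, firing_rate)
```

In a directly trained network, the forward pass would use `heaviside` while the backward pass would substitute `surrogate_grad`, which is one common way to obtain usable gradients despite the binary output.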

Similar articles

1. Human-Level Control Through Directly Trained Deep Spiking Q-Networks.
IEEE Trans Cybern. 2023 Nov;53(11):7187-7198. doi: 10.1109/TCYB.2022.3198259. Epub 2023 Oct 17.
2. Toward robust and scalable deep spiking reinforcement learning.
Front Neurorobot. 2023 Jan 20;16:1075647. doi: 10.3389/fnbot.2022.1075647. eCollection 2022.
3. Solving the spike feature information vanishing problem in spiking deep Q network with potential based normalization.
Front Neurosci. 2022 Aug 25;16:953368. doi: 10.3389/fnins.2022.953368. eCollection 2022.
4. Combining STDP and binary networks for reinforcement learning from images and sparse rewards.
Neural Netw. 2021 Dec;144:496-506. doi: 10.1016/j.neunet.2021.09.010. Epub 2021 Sep 17.
5. Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to Atari Breakout game.
Neural Netw. 2019 Dec;120:108-115. doi: 10.1016/j.neunet.2019.08.009. Epub 2019 Aug 25.
6. Optimizing Deeper Spiking Neural Networks for Dynamic Vision Sensing.
Neural Netw. 2021 Dec;144:686-698. doi: 10.1016/j.neunet.2021.09.022. Epub 2021 Oct 5.
7. Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures.
Front Neurosci. 2020 Feb 28;14:119. doi: 10.3389/fnins.2020.00119. eCollection 2020.
8. SSTDP: Supervised Spike Timing Dependent Plasticity for Efficient Spiking Neural Network Training.
Front Neurosci. 2021 Nov 4;15:756876. doi: 10.3389/fnins.2021.756876. eCollection 2021.
9. Deep Learning With Spiking Neurons: Opportunities and Challenges.
Front Neurosci. 2018 Oct 25;12:774. doi: 10.3389/fnins.2018.00774. eCollection 2018.
10. SPIDEN: deep Spiking Neural Networks for efficient image denoising.
Front Neurosci. 2023 Aug 11;17:1224457. doi: 10.3389/fnins.2023.1224457. eCollection 2023.

Cited by

1. SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence.
Sci Adv. 2023 Oct 6;9(40):eadi1480. doi: 10.1126/sciadv.adi1480.
2. CBMC: A Biomimetic Approach for Control of a 7-Degree of Freedom Robotic Arm.
Biomimetics (Basel). 2023 Aug 25;8(5):389. doi: 10.3390/biomimetics8050389.
3. Critically synchronized brain waves form an effective, robust and flexible basis for human memory and learning.
Sci Rep. 2023 Mar 16;13(1):4343. doi: 10.1038/s41598-023-31365-6.
4. Solving the spike feature information vanishing problem in spiking deep Q network with potential based normalization.
Front Neurosci. 2022 Aug 25;16:953368. doi: 10.3389/fnins.2022.953368. eCollection 2022.