

Exploring spiking neural networks for deep reinforcement learning in robotic tasks.

Authors

Zanatta Luca, Barchi Francesco, Manoni Simone, Tolu Silvia, Bartolini Andrea, Acquaviva Andrea

Affiliations

Department of Electrical, Electronic, and Information Engineering "Guglielmo Marconi", Università di Bologna, 40126, Bologna, Italy.

Department of Electrical and Photonics Engineering Automation and Control, Danmarks Tekniske Universitet, 2800, Lyngby-Taarbæk, Denmark.

Publication

Sci Rep. 2024 Dec 28;14(1):30648. doi: 10.1038/s41598-024-77779-8.

DOI: 10.1038/s41598-024-77779-8
PMID: 39730367
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11680704/
Abstract

Spiking Neural Networks (SNNs) stand as the third generation of Artificial Neural Networks (ANNs), mirroring the functionality of the mammalian brain more closely than their predecessors. Their computational units, spiking neurons, characterized by Ordinary Differential Equations (ODEs), allow for dynamic system representation, with spikes serving as the medium for asynchronous communication among neurons. Due to their inherent ability to capture input dynamics, SNNs hold great promise for deep networks in Reinforcement Learning (RL) tasks. Deep RL (DRL), and in particular Proximal Policy Optimization (PPO), has proven valuable for training robots, given the difficulty of creating comprehensive offline datasets that capture all environmental features. DRL combined with SNNs offers a compelling solution for tasks characterized by temporal complexity. In this work, we study the effectiveness of SNNs on DRL tasks, leveraging a novel framework we developed for training SNNs with PPO in the Isaac Gym simulator, implemented using the skrl library. Thanks to its significantly faster training speed compared to available SNN DRL tools, the framework allowed us to: (i) perform an effective exploration of SNN configurations for DRL robotic tasks; (ii) compare SNNs and ANNs across various network configurations, such as the number of layers and neurons. Our work demonstrates that in DRL tasks the optimal SNN topology has fewer layers than its ANN counterpart, and we highlight that, with the state-of-the-art SNN architectures used in complex RL tasks such as Ant, SNNs have difficulty fully leveraging deeper layers. Finally, we applied the best topology identified with our Isaac Gym-based framework to the Ant-v4 benchmark running on the MuJoCo simulator, obtaining a performance improvement by a factor of 4.4 over the state-of-the-art SNN trained on the same task.
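The abstract describes spiking neurons as dynamical units governed by ODEs, communicating through discrete spikes. As a minimal illustrative sketch (not the authors' implementation; the function name, constants, and constant-input scenario are invented here), a leaky integrate-and-fire neuron can be simulated with forward-Euler integration of tau * dv/dt = -(v - v_rest) + I(t), emitting a spike and resetting whenever the membrane potential crosses threshold:

```python
def simulate_lif(inputs, dt=1e-3, tau=0.02, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Forward-Euler simulation of a leaky integrate-and-fire (LIF) neuron.

    inputs: sequence of input currents I(t), one value per time step of size dt.
    Returns a 0/1 spike train of the same length.
    """
    v = v_rest
    spikes = []
    for i_t in inputs:
        # Euler step of tau * dv/dt = -(v - v_rest) + I(t)
        v += (dt / tau) * (-(v - v_rest) + i_t)
        if v >= v_thresh:
            spikes.append(1)   # spike event: the asynchronous "message"
            v = v_reset        # hard reset after firing
        else:
            spikes.append(0)
    return spikes

# A constant supra-threshold drive produces regular, periodic spiking.
spk = simulate_lif([2.0] * 200)
```

With these illustrative constants, the membrane potential charges toward the input value and crosses threshold every few tens of steps, so the spike train is sparse but nonempty; this input-dependent timing is the dynamic behavior the abstract credits SNNs with capturing.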


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/901ec9084a22/41598_2024_77779_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/5c850882e7f7/41598_2024_77779_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/27b3ea2745b4/41598_2024_77779_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/2363aa985fb7/41598_2024_77779_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/76723557caac/41598_2024_77779_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/b1fc7fb54cc2/41598_2024_77779_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/e5ce03c319cb/41598_2024_77779_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e9c/11680704/4562a3d07486/41598_2024_77779_Fig8_HTML.jpg

Similar articles

1
Exploring spiking neural networks for deep reinforcement learning in robotic tasks.
Sci Rep. 2024 Dec 28;14(1):30648. doi: 10.1038/s41598-024-77779-8.
2
Toward robust and scalable deep spiking reinforcement learning.
Front Neurorobot. 2023 Jan 20;16:1075647. doi: 10.3389/fnbot.2022.1075647. eCollection 2022.
3
Rethinking the performance comparison between SNNS and ANNS.
Neural Netw. 2020 Jan;121:294-307. doi: 10.1016/j.neunet.2019.09.005. Epub 2019 Sep 19.
4
Training Spiking Neural Networks for Reinforcement Learning Tasks With Temporal Coding Method.
Front Neurosci. 2022 Aug 17;16:877701. doi: 10.3389/fnins.2022.877701. eCollection 2022.
5
Optimizing Deeper Spiking Neural Networks for Dynamic Vision Sensing.
Neural Netw. 2021 Dec;144:686-698. doi: 10.1016/j.neunet.2021.09.022. Epub 2021 Oct 5.
6
Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures.
Front Neurosci. 2020 Feb 28;14:119. doi: 10.3389/fnins.2020.00119. eCollection 2020.
7
Training much deeper spiking neural networks with a small number of time-steps.
Neural Netw. 2022 Sep;153:254-268. doi: 10.1016/j.neunet.2022.06.001. Epub 2022 Jun 15.
8
Quantization Framework for Fast Spiking Neural Networks.
Front Neurosci. 2022 Jul 19;16:918793. doi: 10.3389/fnins.2022.918793. eCollection 2022.
9
SSTDP: Supervised Spike Timing Dependent Plasticity for Efficient Spiking Neural Network Training.
Front Neurosci. 2021 Nov 4;15:756876. doi: 10.3389/fnins.2021.756876. eCollection 2021.
10
A universal ANN-to-SNN framework for achieving high accuracy and low latency deep Spiking Neural Networks.
Neural Netw. 2024 Jun;174:106244. doi: 10.1016/j.neunet.2024.106244. Epub 2024 Mar 15.

Cited by

1
Investigation of Signal Transmission Dynamics in Rulkov Neuronal Networks with Q-Learned Pathways.
Entropy (Basel). 2025 Aug 21;27(8):884. doi: 10.3390/e27080884.
2
AI-Driven Control Strategies for Biomimetic Robotics: Trends, Challenges, and Future Directions.
Biomimetics (Basel). 2025 Jul 14;10(7):460. doi: 10.3390/biomimetics10070460.
3
Near real-time online reinforcement learning with synchronous or asynchronous updates.
Sci Rep. 2025 May 17;15(1):17158. doi: 10.1038/s41598-025-00492-7.

References

1
Fully Spiking Actor Network With Intralayer Connections for Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2025 Feb;36(2):2881-2893. doi: 10.1109/TNNLS.2024.3352653. Epub 2025 Feb 6.
2
Effective Surrogate Gradient Learning With High-Order Information Bottleneck for Spike-Based Machine Intelligence.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1734-1748. doi: 10.1109/TNNLS.2023.3329525. Epub 2025 Jan 7.
3
Solving the spike feature information vanishing problem in spiking deep Q network with potential based normalization.
Front Neurosci. 2022 Aug 25;16:953368. doi: 10.3389/fnins.2022.953368. eCollection 2022.
4
Spiking Deep Residual Networks.
IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):5200-5205. doi: 10.1109/TNNLS.2021.3119238. Epub 2023 Aug 4.
5
Combining STDP and binary networks for reinforcement learning from images and sparse rewards.
Neural Netw. 2021 Dec;144:496-506. doi: 10.1016/j.neunet.2021.09.010. Epub 2021 Sep 17.
6
LiDAR-driven spiking neural network for collision avoidance in autonomous driving.
Bioinspir Biomim. 2021 Oct 25;16(6). doi: 10.1088/1748-3190/ac290c.
7
Learning agile and dynamic motor skills for legged robots.
Sci Robot. 2019 Jan 16;4(26). doi: 10.1126/scirobotics.aau5872.
8
A solution to the learning dilemma for recurrent networks of spiking neurons.
Nat Commun. 2020 Jul 17;11(1):3625. doi: 10.1038/s41467-020-17236-y.
9
Spatial Properties of STDP in a Self-Learning Spiking Neural Network Enable Controlling a Mobile Robot.
Front Neurosci. 2020 Feb 26;14:88. doi: 10.3389/fnins.2020.00088. eCollection 2020.
10
Reinforcement Learning in Spiking Neural Networks with Stochastic and Deterministic Synapses.
Neural Comput. 2019 Dec;31(12):2368-2389. doi: 10.1162/neco_a_01238. Epub 2019 Oct 15.