Human-Level Control Through Directly Trained Deep Spiking Q-Networks.

Author information

Liu Guisong, Deng Wenjie, Xie Xiurui, Huang Li, Tang Huajin

Publication information

IEEE Trans Cybern. 2023 Nov;53(11):7187-7198. doi: 10.1109/TCYB.2022.3198259. Epub 2023 Oct 17.

DOI:10.1109/TCYB.2022.3198259

Abstract

As the third-generation neural networks, spiking neural networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, deep spiking reinforcement learning (DSRL), that is, the reinforcement learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and the nondifferentiable property of the spiking function. To address these issues, we propose a deep spiking Q-network (DSQN) in this article. Specifically, we propose a directly trained DSRL architecture based on the leaky integrate-and-fire (LIF) neurons and deep Q-network (DQN). Then, we adapt a direct spiking learning algorithm for the DSQN. We further demonstrate the advantages of using LIF neurons in DSQN theoretically. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, generalization and energy efficiency. To the best of our knowledge, our work is the first one to achieve state-of-the-art performance on multiple Atari games with the directly trained SNN.
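
The abstract names two obstacles to direct training: the binary output and the nondifferentiable spiking function. A minimal sketch of a generic discrete-time LIF neuron can make both concrete. It is not the DSQN model from the paper; the update rule, the function name `simulate_lif`, and the values of `tau`, `v_threshold`, and `v_reset` are illustrative assumptions.

```python
def simulate_lif(input_current, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one generic discrete-time LIF neuron (illustrative only).

    Returns a binary spike train with one entry per input step.
    All parameter values are assumptions, not taken from the paper.
    """
    v = v_reset  # membrane potential starts at the reset value
    spikes = []
    for x in input_current:
        # Leaky integration: v decays toward v_reset while accumulating input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            # Hard threshold: the output is exactly 0 or 1, so its derivative
            # with respect to v is zero almost everywhere.
            spikes.append(1)
            v = v_reset  # hard reset after firing
        else:
            spikes.append(0)
    return spikes


# A constant input makes the neuron fire on a regular schedule.
spikes = simulate_lif([1.5] * 10)
print(spikes)
```

Because the threshold step has a zero gradient almost everywhere, plain backpropagation cannot pass a useful error signal through it. Direct training methods in general commonly work around this by substituting a smooth surrogate derivative in the backward pass; the abstract does not say which technique the authors adapt.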
