Institute of Theoretical Computer Science, Graz University of Technology, Inffeldgasse 16b, Graz, Austria.
Nat Commun. 2020 Jul 17;11(1):3625. doi: 10.1038/s41467-020-17236-y.
Recurrently connected networks of spiking neurons underlie the astounding information processing capabilities of the brain. Yet, in spite of extensive research, how they can learn through synaptic plasticity to carry out complex network computations remains unclear. We argue that two pieces of this puzzle were provided by experimental data from neuroscience. A mathematical result tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This learning method, called e-prop, approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence.
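The combination mentioned in the abstract can be sketched as a factorization of the loss gradient into two locally available quantities: an online learning signal and a synapse-specific eligibility trace computed forward in time. The following is a minimal sketch consistent with the e-prop formulation; the symbols (E for the loss, W_ji for the weight from neuron i to neuron j, L_j^t for the learning signal, and e_ji^t for the eligibility trace) are notational assumptions used here for illustration:

\frac{dE}{dW_{ji}} \;\approx\; \sum_{t} L_j^{t} \, e_{ji}^{t}

Here L_j^t approximates the instantaneous influence of neuron j's output on the loss, and e_ji^t accumulates the local history of pre- and postsynaptic activity at the synapse. Because both factors are available online, no backward pass through time is needed, in contrast to BPTT.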