Centre for Neural Circuits and Behaviour, University of Oxford, Oxford OX1 3SR, U.K., and Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland,
Centre for Neural Circuits and Behaviour, University of Oxford, Oxford OX1 3SR, U.K., and Institute for Science and Technology, 3400 Klosterneuburg, Austria,
Neural Comput. 2021 Mar 26;33(4):899-925. doi: 10.1162/neco_a_01367.
Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. Yet how network connectivity relates to function is poorly understood, and the functional capabilities of models of spiking networks are still rudimentary. The lack of both theoretical insight and practical algorithms to find the necessary connectivity poses a major impediment to both studying information processing in the brain and building efficient neuromorphic hardware systems. The training algorithms that solve this problem for artificial neural networks typically rely on gradient descent. But doing so in spiking networks has remained challenging due to the nondifferentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients affect learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative's scale can substantially affect learning performance. When we combine surrogate gradients with suitable activity regularization techniques, spiking networks perform robust information processing in the sparse-activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to modeling functional spiking neural networks.
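The core idea the abstract describes can be illustrated with a minimal sketch: the forward pass uses the nondifferentiable Heaviside spike nonlinearity, while the backward pass substitutes a smooth surrogate derivative whose scale parameter (here `beta`) is the kind of design parameter the study varies. This is a hypothetical toy illustration, not the paper's actual networks or training setup; the fast-sigmoid-shaped surrogate is one common choice among several.

```python
import math

def spike(u, theta=1.0):
    # Forward pass: nondifferentiable Heaviside spike nonlinearity.
    # Emits a spike (1.0) when the membrane potential u crosses threshold theta.
    return 1.0 if u >= theta else 0.0

def surrogate_grad(u, theta=1.0, beta=10.0):
    # Backward pass: smooth fast-sigmoid-shaped surrogate derivative.
    # Its true derivative would be zero almost everywhere; the surrogate
    # replaces it with a bump around threshold. beta sets the scale
    # (steepness), the kind of parameter whose choice can matter.
    return 1.0 / (beta * abs(u - theta) + 1.0) ** 2

# Toy gradient descent on a single weight: train the unit to emit a spike.
w, x, lr = 0.1, 1.0, 0.5
for _ in range(200):
    u = w * x                              # membrane potential (single step, no leak)
    s = spike(u)                           # actual binary spike output
    err = s - 1.0                          # target output: a spike (1.0)
    grad_w = err * surrogate_grad(u) * x   # chain rule, surrogate in place of d(spike)/du
    w -= lr * grad_w                       # gradient-descent update

print(spike(w * x))                        # the trained unit now spikes
```

Note that only the backward pass is approximated: the forward dynamics keep their discrete, all-or-none spikes, which is what makes this a surrogate-gradient method rather than a smoothed-network approximation.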