Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom.
Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands.
PLoS Comput Biol. 2020 Oct 2;16(10):e1008215. doi: 10.1371/journal.pcbi.1008215. eCollection 2020 Oct.
Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. The primate visual system, by contrast, contains abundant recurrent connections. Recurrent signal flow enables recycling of limited computational resources over time, and so might boost the performance of a physically finite brain or model. Here we show: (1) Recurrent convolutional neural network models outperform feedforward convolutional models matched in their number of parameters in large-scale visual recognition tasks on natural images. (2) Setting a confidence threshold, at which recurrent computations terminate and a decision is made, enables flexible trading of speed for accuracy. At a given confidence threshold, the model expends more time and energy on images that are harder to recognise, without requiring additional parameters for deeper computations. (3) The recurrent model's reaction time for an image predicts the human reaction time for the same image better than several parameter-matched and state-of-the-art feedforward models. (4) Across confidence thresholds, the recurrent model emulates the behaviour of feedforward control models in that it achieves the same accuracy at approximately the same computational cost (mean number of floating-point operations). However, the recurrent model can be run longer (higher confidence threshold) and then outperforms parameter-matched feedforward comparison models. These results suggest that recurrent connectivity, a hallmark of biological visual systems, may be essential for understanding the accuracy, flexibility, and dynamics of human visual recognition.
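The confidence-threshold mechanism in point (2) can be illustrated with a minimal sketch. It assumes that confidence is measured as the entropy of the softmax readout at each recurrent time step, and that the model's "reaction time" is the first step at which that entropy falls to or below the threshold. The abstract does not specify the confidence measure, so this choice, the function names (`softmax`, `entropy`, `reaction_time`), and the toy logits are all hypothetical and are not taken from the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; lower entropy is treated as higher confidence."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def reaction_time(logits_per_step, entropy_threshold):
    """Return (step, predicted_class) for one image.

    logits_per_step: readout logits at each recurrent time step (hypothetical input).
    The loop stops at the first step whose softmax entropy is at or below the
    threshold. If the threshold is never reached, the final step is used.
    """
    if not logits_per_step:
        raise ValueError("need at least one time step")
    for step, logits in enumerate(logits_per_step, start=1):
        probs = softmax(logits)
        if entropy(probs) <= entropy_threshold:
            break
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return step, predicted


# Toy readouts (assumed values): an "easy" image is confident immediately,
# a "hard" image only becomes confident after more recurrent steps.
easy = [[4.0, 0.0, 0.0], [5.0, 0.0, 0.0], [6.0, 0.0, 0.0]]
hard = [[0.5, 0.2, 0.0], [1.5, 0.3, 0.0], [4.0, 0.0, 0.0]]

for name, readouts in (("easy", easy), ("hard", hard)):
    print(name, reaction_time(readouts, entropy_threshold=0.5))
```

Under these assumptions, lowering `entropy_threshold` demands more confidence and so makes the model run longer on hard images, while raising it gives faster but less certain decisions. The same set of weights serves every threshold, which is the speed-accuracy trade-off the abstract describes.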