Alamia Andrea, Mozafari Milad, Choksi Bhavin, VanRullen Rufin
CerCo, CNRS, 31052 Toulouse, France; ANITI, Université de Toulouse, 31062, Toulouse, France.
CerCo, CNRS, 31052 Toulouse, France; IRIT, CNRS, 31062, Toulouse, France.
Neural Netw. 2023 Jan;157:280-287. doi: 10.1016/j.neunet.2022.10.020. Epub 2022 Oct 27.
Brain-inspired machine learning is attracting growing attention, particularly in computer vision. Several studies have investigated the inclusion of top-down feedback connections in convolutional networks; however, it remains unclear how and when these connections are functionally helpful. Here we address this question in the context of object recognition under noisy conditions. We consider deep convolutional neural networks (CNNs) as models of feed-forward visual processing and implement Predictive Coding (PC) dynamics through feedback connections (predictive feedback) trained for reconstruction or classification of clean images. First, we show that the accuracy of the network implementing PC dynamics is significantly higher than that of its equivalent forward network. Importantly, to directly assess the computational role of predictive feedback in various experimental situations, we optimize and interpret the hyper-parameters controlling the network's recurrent dynamics. That is, we let the optimization process determine whether top-down connections and predictive coding dynamics are functionally beneficial. Across different model depths and architectures (3-layer CNN, ResNet18, and EfficientNetB0) and against various types of noise (CIFAR100-C), we find that the network increasingly relies on top-down predictions as the noise level increases; in deeper networks, this effect is most prominent at lower layers. Overall, our results provide novel insights relevant to Neuroscience, by confirming the computational role of feedback connections in sensory systems, and to Machine Learning, by revealing how these connections can improve the robustness of current vision models.
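To make the idea of hyper-parameters controlling the recurrent dynamics concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each layer's new activation is a weighted mix of a feed-forward drive (weight `beta`), a top-down prediction from the layer above (weight `lam`), and the layer's own previous state (weight `1 - beta - lam`). The function name `pc_step`, the dense weight matrices, the `tanh` nonlinearity, and the omission of any explicit error-correction term are all illustrative assumptions; the abstract does not specify the update rule.

```python
import numpy as np


def pc_step(states, x, W_ff, W_fb, beta, lam):
    """One recurrent update of a toy layered network (hypothetical sketch).

    states : list of activation vectors, one per layer, from the previous step
    x      : input vector
    W_ff   : W_ff[n] maps the layer below (or the input) to layer n
    W_fb   : W_fb[n] maps layer n+1 back to layer n (top layer has none)
    beta   : per-layer weight on the feed-forward drive
    lam    : per-layer weight on the top-down prediction
    """
    n_layers = len(states)
    new_states = []
    below = x
    for n in range(n_layers):
        # Feed-forward drive computed from the freshly updated layer below.
        ff = np.tanh(W_ff[n] @ below)
        if n + 1 < n_layers:
            # Top-down prediction from the layer above at the previous step.
            fb = W_fb[n] @ states[n + 1]
            b, l = beta[n], lam[n]
        else:
            # Assumption: the top layer receives no feedback.
            fb = np.zeros_like(states[n])
            b, l = beta[n], 0.0
        # Mix of feed-forward drive, feedback prediction, and memory.
        new = b * ff + l * fb + (1.0 - b - l) * states[n]
        new_states.append(new)
        below = new
    return new_states
```

In this sketch, setting `beta = 1` and `lam = 0` for every layer recovers a plain feed-forward pass, while a larger `lam` means the layer leans more on top-down predictions. Under these assumptions, optimizing `beta` and `lam` per layer and per noise level is one way to read off how much the network relies on feedback.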