Zhang Ying, Pan Xiaochuan, Wang Yihong
Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai, China.
Front Psychiatry. 2022 Oct 25;13:1008011. doi: 10.3389/fpsyt.2022.1008011. eCollection 2022.
It is known that humans and animals can learn and utilize category information quickly and efficiently to adapt to changing environments, and several brain areas are involved in learning and encoding category information. However, it remains unclear how the brain learns and forms categorical representations at the level of neural circuits. To investigate this issue at the network level, we combined a recurrent neural network with reinforcement learning to construct a deep reinforcement learning model that demonstrates how categories are learned and represented in the network. The model consists of a policy network and a value network. The policy network is responsible for updating the policy to choose actions, while the value network is responsible for evaluating actions to predict rewards. The agent learns dynamically through the information interaction between the policy network and the value network. The model was trained to learn six stimulus-stimulus associative chains in a sequential paired-association task previously learned by a monkey. The simulation results demonstrated that our model learned the stimulus-stimulus associative chains and successfully reproduced behavior similar to that of the monkey performing the same task. Two types of neurons were found in the model: one type primarily encoded identity information about individual stimuli; the other type mainly encoded category information about the associated stimuli within one chain. Both types of activity patterns were also observed in the primate prefrontal cortex after the monkey learned the same task. Furthermore, the ability of these two types of neurons to encode stimulus or category information was enhanced as the model learned the task. Our results suggest that neurons in a recurrent neural network can form categorical representations through deep reinforcement learning while learning stimulus-stimulus associations. This may provide a new approach for understanding the neuronal mechanisms by which the prefrontal cortex learns and encodes category information.
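The policy/value interaction described above is the standard actor-critic scheme: the value head's reward-prediction error (the advantage) drives updates to both heads. The following is a minimal sketch of that scheme with a recurrent core, under assumed toy dimensions and update rules; the layer sizes, learning rate, and the "action 0 is correct" reward rule are illustrative placeholders, not the paper's actual architecture or task parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stim, n_hidden, n_act = 6, 32, 3  # hypothetical: 6 stimuli, 3 choice actions

# Recurrent core shared by the policy head and the value head
W_in = rng.normal(0.0, 0.1, (n_hidden, n_stim))
W_rec = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
W_pi = rng.normal(0.0, 0.1, (n_act, n_hidden))  # policy head (action logits)
w_v = rng.normal(0.0, 0.1, n_hidden)            # value head (scalar prediction)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(h, x):
    """One recurrent step: update hidden state, return policy and value."""
    h = np.tanh(W_in @ x + W_rec @ h)
    return h, softmax(W_pi @ h), w_v @ h

# One trial: present stimulus 0, choose an action, receive reward, update.
x = np.zeros(n_stim); x[0] = 1.0      # one-hot stimulus input
h = np.zeros(n_hidden)                # initial hidden state
h, pi, v = step(h, x)
a = rng.choice(n_act, p=pi)           # sample action from the policy
r = 1.0 if a == 0 else 0.0            # placeholder rule: action 0 is "correct"
adv = r - v                           # reward-prediction error (advantage)

lr = 0.1
w_v += lr * adv * h                           # value head: shrink prediction error
grad_logits = -pi; grad_logits[a] += 1.0      # d log pi(a) / d logits
W_pi += lr * adv * np.outer(grad_logits, h)   # policy head: REINFORCE with baseline
```

Run over many trials, this loop lets the value network's prediction errors shape the policy, which is the dynamic interaction between the two networks that the abstract refers to.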