VA Boston Healthcare System, MA.
Boston University School of Medicine, MA.
J Cogn Neurosci. 2022 Jul 1;34(8):1429-1446. doi: 10.1162/jocn_a_01873.
Simple probabilistic reinforcement learning is recognized as a striatum-based learning system but has, in recent years, also been associated with hippocampal involvement. This study examined whether such involvement may be attributed to observation-based learning (OL) processes running in parallel with striatum-based reinforcement learning. A computational model of OL, mirroring classic models of reinforcement-based learning (RL), was constructed and applied to the neuroimaging data set of Palombo, Hayes, Reid, and Verfaellie [2019. Hippocampal contributions to value-based learning: Converging evidence from fMRI and amnesia. Cognitive, Affective & Behavioral Neuroscience, 19(3), 523-536]. Results suggested that OL processes may indeed take place concomitantly with reinforcement learning and involve activation of the hippocampus and central orbitofrontal cortex. However, rather than indicating independent mechanisms running in parallel, the brain correlates of the OL and RL prediction errors pointed to collaboration between the two systems, directly implicating the hippocampus in computing the discrepancy between the expected and actual reinforcing values of actions. These findings are consistent with previous accounts of a role for the hippocampus in encoding the strength of observed stimulus-outcome associations, with such associations updated through striatal reinforcement-based computations. In addition, enhanced negative RL prediction error signaling was found in the anterior insula with greater use of OL over RL processes. This result may suggest an additional mode of collaboration between the OL and RL systems, one implicating the error monitoring network.
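The abstract does not give the model equations, but the prediction-error logic it describes can be illustrated with a minimal sketch. The code below assumes a classic Rescorla-Wagner-style delta rule for the RL value update, mirrored by an observation-based update of stimulus-outcome association strength; the function names, learning rate, and exact form of the OL update are illustrative assumptions, not the authors' implementation.

```python
def rl_update(value, reward, alpha=0.1):
    """One delta-rule step (assumed form): the RL prediction error is the
    discrepancy between the actual and expected reinforcing value."""
    delta = reward - value            # RL prediction error
    return value + alpha * delta, delta

def ol_update(assoc, outcome_observed, alpha=0.1):
    """Mirror-image OL step (assumed form): association strength between a
    stimulus and an observed outcome is nudged toward what was observed."""
    target = 1.0 if outcome_observed else 0.0
    delta = target - assoc            # OL prediction error
    return assoc + alpha * delta, delta

# Example: an expected value of 0.5 followed by a reward of 1.0 yields a
# positive prediction error of 0.5 and a small upward value revision.
value, rl_error = rl_update(0.5, reward=1.0)
assoc, ol_error = ol_update(0.5, outcome_observed=True)
```

In this sketch the two updates run on the same trial, which is one way to operationalize "concomitant" OL and RL processes; the study's fMRI analyses then ask which brain regions track each error term.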