Neural computations underlying inverse reinforcement learning in the human brain.

Affiliations

Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, United States.

Computation and Neural Systems Program, California Institute of Technology, Pasadena, United States.

Publication information

Elife. 2017 Oct 30;6:e29718. doi: 10.7554/eLife.29718.

Abstract

In inverse reinforcement learning an observer infers the reward distribution available for actions in the environment solely through observing the actions implemented by another agent. To address whether this computational process is implemented in the human brain, participants underwent fMRI while learning about slot machines yielding hidden preferred and non-preferred food outcomes with varying probabilities, through observing the repeated slot choices of agents with similar and dissimilar food preferences. Using formal model comparison, we found that participants implemented inverse RL as opposed to a simple imitation strategy, in which the actions of the other agent are copied instead of inferring the underlying reward structure of the decision problem. Our computational fMRI analysis revealed that anterior dorsomedial prefrontal cortex encoded inferences about action-values within the value space of the agent as opposed to that of the observer, demonstrating that inverse RL is an abstract cognitive process divorceable from the values and concerns of the observer him/herself.
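The contrast between inverse RL and imitation described in the abstract can be illustrated with a toy sketch. This is not the model fitted in the paper. It assumes two slot machines, two foods (A and B), an agent whose food utilities are known to the observer, a softmax choice rule with a hypothetical inverse temperature `beta`, and a coarse grid of candidate outcome probabilities. All function names and parameters below are illustrative.

```python
import math
from itertools import product

# Candidate values for P(food A) on each machine (an assumed coarse grid).
GRID = [i / 10 for i in range(11)]


def machine_values(probs_a, utilities):
    """Expected utility of each machine given P(food A) and (u_A, u_B)."""
    u_a, u_b = utilities
    return [p * u_a + (1 - p) * u_b for p in probs_a]


def softmax_choice_prob(values, chosen, beta):
    """Probability that a softmax chooser picks `chosen` (assumed choice rule)."""
    top = max(values)
    weights = [math.exp(beta * (v - top)) for v in values]
    return weights[chosen] / sum(weights)


def infer_outcome_probs(observed_choices, agent_utilities, beta=5.0):
    """Posterior mean of P(food A) per machine, given the agent's choices.

    Uses a uniform prior over the grid and Bayes' rule with the softmax
    likelihood of the observed choices under the agent's own utilities.
    """
    hypotheses = list(product(GRID, repeat=2))
    log_post = []
    for hyp in hypotheses:
        values = machine_values(hyp, agent_utilities)
        log_post.append(
            sum(math.log(softmax_choice_prob(values, c, beta))
                for c in observed_choices)
        )
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [
        sum(w * hyp[m] for w, hyp in zip(weights, hypotheses)) / total
        for m in range(2)
    ]


def inverse_rl_choice(observed_choices, agent_utilities, observer_utilities,
                      beta=5.0):
    """Infer the outcome distribution, then value it with the observer's utilities."""
    probs = infer_outcome_probs(observed_choices, agent_utilities, beta)
    values = machine_values(probs, observer_utilities)
    return max(range(2), key=lambda m: values[m])


def imitation_choice(observed_choices):
    """Copy whichever machine the agent chose most often."""
    return max(range(2), key=observed_choices.count)


# The agent prefers food A and mostly picks machine 0.
observed = [0, 0, 1, 0, 0, 0, 1, 0]
similar = inverse_rl_choice(observed, (1.0, 0.0), (1.0, 0.0))
dissimilar = inverse_rl_choice(observed, (1.0, 0.0), (0.0, 1.0))
copied = imitation_choice(observed)
```

Under these assumptions the two strategies agree when observer and agent share preferences, and diverge when they do not: imitation still copies the agent's favourite machine, while the inverse RL observer picks the machine inferred to deliver more of the food it prefers itself.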

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dbe/5662289/703b91987487/elife-29718-fig1.jpg
