Learning the structure of the world: The adaptive nature of state-space and action representations in multi-stage decision-making.

Affiliations

School of Psychology, UNSW, Sydney, Australia.

Data61, CSIRO, Sydney, Australia.

Publication information

PLoS Comput Biol. 2019 Sep 6;15(9):e1007334. doi: 10.1371/journal.pcbi.1007334. eCollection 2019 Sep.

Abstract

State-space and action representations form the building blocks of decision-making processes in the brain; states map external cues to the current situation of the agent whereas actions provide the set of motor commands from which the agent can choose to achieve specific goals. Although these factors differ across environments, it is currently unknown whether or how accurately state and action representations are acquired by the agent because previous experiments have typically provided this information a priori through instruction or pre-training. Here we studied how state and action representations adapt to reflect the structure of the world when such a priori knowledge is not available. We used a sequential decision-making task in rats in which they were required to pass through multiple states before reaching the goal, and for which the number of states and how they map onto external cues were unknown a priori. We found that, early in training, animals selected actions as if the task was not sequential and outcomes were the immediate consequence of the most proximal action. During the course of training, however, rats recovered the true structure of the environment and made decisions based on the expanded state-space, reflecting the multiple stages of the task. Similarly, we found that the set of actions expanded with training, although the emergence of new action sequences was sensitive to the experimental parameters and specifics of the training procedure. We conclude that the profile of choices shows a gradual shift from simple representations to more complex structures compatible with the structure of the world.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4fd0/6750884/b6634f656eeb/pcbi.1007334.g001.jpg
