The functional form of value normalization in human reinforcement learning.

Affiliations

Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France.

Département d'Etudes Cognitives, Ecole Normale Supérieure, PSL University, Paris, France.

Publication information

Elife. 2023 Jul 10;12:e83891. doi: 10.7554/eLife.83891.

Abstract

Reinforcement learning research in humans and other species indicates that rewards are represented in a context-dependent manner. More specifically, reward representations seem to be normalized as a function of the value of the alternative options. The dominant view postulates that value context-dependence is achieved via a divisive normalization rule, inspired by perceptual decision-making research. However, behavioral and neural evidence points to another plausible mechanism: range normalization. Critically, previous experimental designs were ill-suited to disentangle the divisive and the range normalization accounts, which generate similar behavioral predictions in many circumstances. To address this question, we designed a new learning task where we manipulated, across learning contexts, the number of options and the value ranges. Behavioral and computational analyses falsify the divisive normalization account and rather provide support for the range normalization rule. Together, these results shed new light on the computational mechanisms underlying context-dependence in learning and decision-making.
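To make the contrast between the two accounts concrete, here is a minimal sketch of how they can diverge when the number of options changes. It assumes the simplest textbook forms of each rule (divisive: each value divided by the sum of all option values in the context; range: each value rescaled between the context minimum and maximum). The function names and the example values are illustrative assumptions, not the models or stimuli used in the paper.

```python
def divisive_normalization(values):
    """Divide each option value by the sum of all option values in the context.

    Assumed simplified form; the paper's fitted model may differ.
    """
    total = sum(values)
    return [v / total for v in values]


def range_normalization(values):
    """Rescale each option value to [0, 1] using the context minimum and maximum.

    Assumed simplified form; the paper's fitted model may differ.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate context with no range: return zeros by convention (an assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Two hypothetical learning contexts sharing the same value range (14 to 50)
# but differing in the number of options.
binary_context = [14, 50]
trinary_context = [14, 32, 50]

for name, context in [("binary", binary_context), ("trinary", trinary_context)]:
    print(name, "divisive:", divisive_normalization(context))
    print(name, "range:   ", range_normalization(context))
```

Under these assumptions, adding a third option inside the existing range lowers the divisively normalized value of the best option, while its range-normalized value stays the same. This is the kind of dissociation that varying the number of options and the value ranges across contexts can expose.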

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9905/10393293/074bca6ef617/elife-83891-fig1.jpg
