Dual credit assignment processes underlie dopamine signals in a complex spatial environment.

Author information

Krausz Timothy A, Comrie Alison E, Frank Loren M, Daw Nathaniel D, Berke Joshua D

Affiliations

Neuroscience Graduate Program, University of California, San Francisco.

Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, UCSF.

Publication information

bioRxiv. 2023 Mar 19:2023.02.15.528738. doi: 10.1101/2023.02.15.528738.

DOI:10.1101/2023.02.15.528738

PMID:36993482

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10054934/

Abstract

Dopamine in the nucleus accumbens helps motivate behavior based on expectations of future reward ("values"). These values need to be updated by experience: after receiving reward, the choices that led to reward should be assigned greater value. There are multiple theoretical proposals for how this credit assignment could be achieved, but the specific algorithms that generate updated dopamine signals remain uncertain. We monitored accumbens dopamine as freely behaving rats foraged for rewards in a complex, changing environment. We observed brief pulses of dopamine both when rats received reward (scaling with prediction error), and when they encountered novel path opportunities. Furthermore, dopamine ramped up as rats ran towards reward ports, in proportion to the value at each location. By examining the evolution of these dopamine place-value signals, we found evidence for two distinct update processes: progressive propagation along taken paths, as in temporal-difference learning, and inference of value throughout the maze, using internal models. Our results demonstrate that within rich, naturalistic environments dopamine conveys place values that are updated via multiple, complementary learning algorithms.
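The "progressive propagation along taken paths" described in the abstract can be illustrated with a textbook TD(0) update. The sketch below is not the authors' model: the four-location track, the function name `td0_pass`, and the learning rate and discount values are all assumptions chosen for illustration. It shows only the general idea that, under temporal-difference learning, value spreads backward from the reward port by roughly one location per traversal, whereas a model-based process could update distant locations at once.

```python
def td0_pass(values, path, reward, alpha=0.5, gamma=0.9):
    """Apply one traversal's worth of TD(0) updates along a visited path.

    values: dict mapping location -> current value estimate
    path:   ordered list of locations visited, ending at the reward port
    reward: reward delivered on arrival at the final location
    alpha, gamma: learning rate and discount factor (illustrative values)
    """
    values = dict(values)  # work on a copy so the caller's estimates are kept
    for i, state in enumerate(path):
        is_last = i == len(path) - 1
        # Value of the successor location; nothing follows the reward port.
        next_value = 0.0 if is_last else values[path[i + 1]]
        r = reward if is_last else 0.0
        # Reward prediction error for this step.
        delta = r + gamma * next_value - values[state]
        values[state] += alpha * delta
    return values


# Hypothetical linear track: three locations leading to one reward port.
track = ["A", "B", "C", "port"]
v0 = {loc: 0.0 for loc in track}

v1 = td0_pass(v0, track, reward=1.0)  # after the first rewarded run
v2 = td0_pass(v1, track, reward=1.0)  # after the second rewarded run
v3 = td0_pass(v2, track, reward=1.0)  # after the third rewarded run

for label, v in (("run 1", v1), ("run 2", v2), ("run 3", v3)):
    print(label, {loc: round(val, 3) for loc, val in v.items()})
```

In this sketch only the port has nonzero value after the first run, location C acquires value on the second run, and B on the third. That stepwise spread is the signature of TD-style propagation; value appearing at locations the animal has not recently traversed would instead point to inference from an internal model.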

