Liebana Samuel, Laffere Aeron, Toschi Chiara, Schilling Louisa, Moretti Jessica, Podlaski Jacek, Fritsche Matthias, Zatka-Haas Peter, Li Yulong, Bogacz Rafal, Saxe Andrew, Lak Armin
Department of Physiology, Anatomy & Genetics, University of Oxford, Oxford OX1 3PT, UK.
Cell. 2025 Jul 10;188(14):3789-3805.e33. doi: 10.1016/j.cell.2025.05.025. Epub 2025 Jun 11.
Striatal dopamine plays fundamental roles in fine-tuning learned decisions. However, when learning from naive to expert, individuals often exhibit diverse learning trajectories, which has made the underlying dopaminergic mechanisms difficult to understand. Here, we longitudinally measure and manipulate dorsal striatal dopamine signals in mice learning a decision task from naive to expert. The mice's learning trajectories transitioned through sequences of strategies, showing substantial individual diversity. Remarkably, the transitions were systematic; each mouse's early strategy determined its strategy weeks later. Dopamine signals reflected the strategies each animal transitioned through, encoding a subset of stimulus-choice associations. Optogenetic manipulations selectively updated these associations, leading to learning effects distinct from those of reward. A deep neural network using heterogeneous teaching signals, each updating a subset of network association weights, captured our results. Analyzing the model's fixed points explained learning diversity and systematicity. Altogether, this work provides insights into the biological and mathematical principles underlying individual long-term learning trajectories.