A multiplicative reinforcement learning model capturing learning dynamics and interindividual variability in mice.

Affiliation

Research Institute of Molecular Pathology, 1030 Vienna, Austria.

Publication information

Proc Natl Acad Sci U S A. 2013 Dec 3;110(49):19950-5. doi: 10.1073/pnas.1312125110. Epub 2013 Nov 19.

Abstract

Both in humans and in animals, different individuals may learn the same task with strikingly different speeds; however, the sources of this variability remain elusive. In standard learning models, interindividual variability is often explained by variations of the learning rate, a parameter indicating how much synapses are updated on each learning event. Here, we theoretically show that the initial connectivity between the neurons involved in learning a task is also a strong determinant of how quickly the task is learned, provided that connections are updated in a multiplicative manner. To experimentally test this idea, we trained mice to perform an auditory Go/NoGo discrimination task followed by a reversal to compare learning speed when starting from naive or already trained synaptic connections. All mice learned the initial task, but often displayed sigmoid-like learning curves, with a variable delay period followed by a steep increase in performance, as often observed in operant conditioning. For all mice, learning was much faster in the subsequent reversal training. An accurate fit of all learning curves could be obtained with a reinforcement learning model endowed with a multiplicative learning rule, but not with an additive rule. Surprisingly, the multiplicative model could explain a large fraction of the interindividual variability by variations in the initial synaptic weights. Altogether, these results demonstrate the power of multiplicative learning rules to account for the full dynamics of biological learning and suggest an important role of initial wiring in the brain for predispositions to different tasks.
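The contrast between additive and multiplicative updates can be illustrated with a toy sketch. This is not the authors' model: it assumes a single scalar weight `w` in (0, 1] that stands in for task performance, a deterministic error term `1 - w`, and hypothetical names (`learning_curve`, `trials_to_criterion`, `lr`, `criterion`). Under these assumptions, the additive rule gives a curve that rises immediately and depends only weakly on the starting weight, while the multiplicative rule scales each update by the current weight, which produces a sigmoid-like curve whose delay is set by the initial weight.

```python
def learning_curve(w0, lr, n_trials, rule):
    """Return the weight before each trial for a toy one-weight learner.

    w0   : initial weight (assumed to lie in (0, 1])
    lr   : learning rate (assumed to lie in (0, 1))
    rule : "additive" or "multiplicative"
    """
    w = w0
    curve = []
    for _ in range(n_trials):
        curve.append(w)
        error = 1.0 - w  # distance from perfect performance
        if rule == "additive":
            w += lr * error        # update independent of current weight
        elif rule == "multiplicative":
            w += lr * w * error    # update scaled by current weight
        else:
            raise ValueError(f"unknown rule: {rule!r}")
    return curve


def trials_to_criterion(curve, criterion=0.8):
    """Index of the first trial at which the weight reaches the criterion."""
    for trial, value in enumerate(curve):
        if value >= criterion:
            return trial
    return None  # criterion never reached within the simulated trials


if __name__ == "__main__":
    for rule in ("additive", "multiplicative"):
        for w0 in (0.001, 0.1):
            curve = learning_curve(w0, lr=0.1, n_trials=300, rule=rule)
            print(rule, w0, trials_to_criterion(curve))
```

With the same learning rate, changing the initial weight shifts the multiplicative curve by many trials but barely moves the additive one, which mirrors the abstract's point that initial connectivity can account for variability in learning speed. Starting the multiplicative learner from a larger weight likewise removes most of the delay, in the spirit of the faster reversal learning reported above.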
