A multiplicative reinforcement learning model capturing learning dynamics and interindividual variability in mice.

Affiliation

Research Institute of Molecular Pathology, 1030 Vienna, Austria.

Publication information

Proc Natl Acad Sci U S A. 2013 Dec 3;110(49):19950-5. doi: 10.1073/pnas.1312125110. Epub 2013 Nov 19.

DOI:10.1073/pnas.1312125110

PMID:24255115

Full-text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC3856837/

Abstract

Both in humans and in animals, different individuals may learn the same task with strikingly different speeds; however, the sources of this variability remain elusive. In standard learning models, interindividual variability is often explained by variations of the learning rate, a parameter indicating how much synapses are updated on each learning event. Here, we theoretically show that the initial connectivity between the neurons involved in learning a task is also a strong determinant of how quickly the task is learned, provided that connections are updated in a multiplicative manner. To experimentally test this idea, we trained mice to perform an auditory Go/NoGo discrimination task followed by a reversal to compare learning speed when starting from naive or already trained synaptic connections. All mice learned the initial task, but often displayed sigmoid-like learning curves, with a variable delay period followed by a steep increase in performance, as often observed in operant conditioning. For all mice, learning was much faster in the subsequent reversal training. An accurate fit of all learning curves could be obtained with a reinforcement learning model endowed with a multiplicative learning rule, but not with an additive rule. Surprisingly, the multiplicative model could explain a large fraction of the interindividual variability by variations in the initial synaptic weights. Altogether, these results demonstrate the power of multiplicative learning rules to account for the full dynamics of biological learning and suggest an important role of initial wiring in the brain for predispositions to different tasks.
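The contrast between multiplicative and additive updates can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the model fitted in the paper: it tracks a single weight bounded at 1, and the function names `simulate` and `trials_to_criterion`, the learning rate of 0.05, and the criterion of 0.8 are all hypothetical choices made for illustration. Under the multiplicative rule the update is proportional to the current weight, so a small initial weight yields a long delay followed by a steep rise; under the additive rule the initial weight barely changes the time needed to reach criterion.

```python
def simulate(w0, lr, n_trials, rule):
    """Return the trajectory of one toy weight over n_trials updates.

    Illustrative assumption: the weight saturates at 1.0, and every
    trial is a learning event driven by the remaining error (1 - w).
    """
    w = w0
    trajectory = [w]
    for _ in range(n_trials):
        error = 1.0 - w  # distance from the saturated weight
        if rule == "multiplicative":
            w += lr * w * error  # update scales with the current weight
        elif rule == "additive":
            w += lr * error  # update ignores the current weight
        else:
            raise ValueError(f"unknown rule: {rule}")
        trajectory.append(w)
    return trajectory


def trials_to_criterion(trajectory, criterion=0.8):
    """Return the first trial index at which the weight reaches criterion."""
    for t, w in enumerate(trajectory):
        if w >= criterion:
            return t
    return None  # criterion never reached within the simulated trials


if __name__ == "__main__":
    # Same learning rate for every simulated "individual"; only the
    # initial weight differs.
    for rule in ("multiplicative", "additive"):
        for w0 in (0.001, 0.01, 0.1):
            traj = simulate(w0, lr=0.05, n_trials=400, rule=rule)
            print(rule, w0, trials_to_criterion(traj))
```

Running the script prints the number of trials to criterion for three initial weights under each rule. With a shared learning rate, the spread across initial weights is wide for the multiplicative rule and narrow for the additive one, which is the qualitative point the abstract makes about initial connectivity.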

