Learning of sequential movements by neural network model with dopamine-like reinforcement signal.

Author information

Suri R E, Schultz W

Affiliation

Institute of Physiology, University of Fribourg, Switzerland.

Publication information

Exp Brain Res. 1998 Aug;121(3):350-4. doi: 10.1007/s002210050467.

DOI:10.1007/s002210050467

PMID:9746140

Abstract

Dopamine neurons appear to code an error in the prediction of reward. They are activated by unpredicted rewards, are not influenced by predicted rewards, and are depressed when a predicted reward is omitted. After conditioning, they respond to reward-predicting stimuli in a similar manner. With these characteristics, the dopamine response strongly resembles the predictive reinforcement teaching signal of neural network models implementing the temporal difference learning algorithm. This study explored a neural network model that used a reward-prediction error signal strongly resembling dopamine responses for learning movement sequences. A different stimulus was presented in each step of the sequence and required a different movement reaction, and reward occurred at the end of the correctly performed sequence. The dopamine-like predictive reinforcement signal efficiently allowed the model to learn long sequences. By contrast, learning with an unconditional reinforcement signal required synaptic eligibility traces of longer and biologically less-plausible durations for obtaining satisfactory performance. Thus, dopamine-like neuronal signals constitute excellent teaching signals for learning sequential behavior.
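The response pattern described in the abstract (activation by an unpredicted reward, no response to a predicted reward, depression when a predicted reward is omitted, and transfer of the response to the reward-predicting stimulus) can be reproduced with a very small temporal difference simulation. The following is a minimal sketch only, not the authors' network: it assumes a generic TD(0) value learner with one weight per time step after stimulus onset, and the function name `run_trial`, the trial length, the stimulus and reward times, the learning rate `alpha`, and the discount factor `gamma` are all illustrative choices that do not come from the paper.

```python
def run_trial(w, stim_t, reward_t, n_steps, reward=1.0,
              alpha=0.2, gamma=0.98, learn=True):
    """Run one trial and return the TD prediction error at each time step.

    w holds one weight per time step elapsed since stimulus onset
    (an assumed, simplified stimulus representation).
    """
    def value(t):
        # No prediction exists before the stimulus appears.
        return w[t - stim_t] if t >= stim_t else 0.0

    delta = [0.0] * n_steps
    for t in range(1, n_steps):
        r = reward if t == reward_t else 0.0
        # TD error: received reward plus discounted new prediction,
        # minus the prediction made one step earlier.
        delta[t] = r + gamma * value(t) - value(t - 1)
        if learn and t - 1 >= stim_t:
            # Adjust the weight that produced the earlier prediction.
            w[t - 1 - stim_t] += alpha * delta[t]
    return delta


# Hypothetical trial layout: stimulus at step 5, reward at step 15.
N_STEPS, STIM_T, REWARD_T = 20, 5, 15
w = [0.0] * (N_STEPS - STIM_T)

# First trial of a naive model: the reward is unpredicted.
first = run_trial(w, STIM_T, REWARD_T, N_STEPS)

# Conditioning: repeat the stimulus-reward pairing.
for _ in range(2000):
    run_trial(w, STIM_T, REWARD_T, N_STEPS)

# Probe trials without further learning.
trained = run_trial(w, STIM_T, REWARD_T, N_STEPS, learn=False)
omitted = run_trial(w, STIM_T, REWARD_T, N_STEPS, reward=0.0, learn=False)

print("unpredicted reward:", first[REWARD_T])
print("predicted reward:  ", trained[REWARD_T])
print("stimulus onset:    ", trained[STIM_T])
print("omitted reward:    ", omitted[REWARD_T])
```

In this sketch the error at the reward time starts positive, shrinks toward zero with training, and becomes negative when the reward is withheld, while a positive error appears at stimulus onset because nothing before the stimulus predicts it. The sequence-learning model in the study additionally uses this signal to train movement selection, which is not shown here.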
