Olaf Sporns, William H. Alexander
Neural Netw. 2002 Jun-Jul;15(4-6):761-74. doi: 10.1016/s0893-6080(02)00062-x.
In this paper we implement a computational model of a neuromodulatory system in an autonomous robot. The output of the neuromodulatory system acts as a value signal, modulating widely distributed synaptic changes. The model is based on anatomical and physiological properties of midbrain diffuse ascending systems, in particular parts of the dopamine and noradrenaline systems. During reward conditioning, the model learns to generate tonic and phasic signals that represent predictions and prediction errors, including precisely timed negative signals if expected rewards are omitted or delayed. We test the robot's learning and behavior in different environmental contexts and observe changes in the development of the neuromodulatory system that depend upon environmental factors. Simulation of a computational model incorporating both reward-related and aversive stimuli leads to the emergence of conditioned reward and aversive behaviors. These studies represent a step towards investigating computational aspects of neuromodulatory systems in autonomous robots.
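The abstract's central claim, that a value signal comes to represent prediction errors and turns negative at the expected time when a reward is omitted, can be illustrated with a small sketch. The code below is not the authors' neuromodulatory model. It swaps in standard temporal-difference (TD) learning over a sequence of time steps following a cue, and it covers only the phasic prediction-error component. The function name `run_trial`, the parameters `alpha` and `gamma`, the unit reward, and the trial length are all assumptions made for illustration.

```python
import numpy as np

def run_trial(w, reward_step, alpha=0.1, gamma=1.0, deliver_reward=True):
    """Run one conditioning trial and return the TD error at every time step.

    Assumption: w[t] is the predicted future reward t steps after cue onset
    (one weight per time step). The weights are updated in place.
    """
    n_steps = len(w)
    delta = np.zeros(n_steps)
    for t in range(n_steps):
        v_t = w[t]
        # The value after the final step of the trial is taken to be zero.
        v_next = w[t + 1] if t + 1 < n_steps else 0.0
        r = 1.0 if (deliver_reward and t == reward_step) else 0.0
        # TD error: received reward plus discounted next prediction,
        # minus the current prediction.
        delta[t] = r + gamma * v_next - v_t
        # The error acts as a value signal that scales the weight change.
        w[t] += alpha * delta[t]
    return delta

if __name__ == "__main__":
    w = np.zeros(10)
    first = run_trial(w, reward_step=5)
    for _ in range(500):
        run_trial(w, reward_step=5)
    trained = run_trial(w, reward_step=5)
    omitted = run_trial(w.copy(), reward_step=5, deliver_reward=False)
    print("first trial:   ", np.round(first, 2))
    print("after training:", np.round(trained, 2))
    print("reward omitted:", np.round(omitted, 2))
```

In this sketch the error at the reward step is 1 on the first trial, because the reward is fully unpredicted, and shrinks toward 0 as the prediction is learned. Withholding the reward after training yields an error of about -1 at exactly the step where the reward was expected, which corresponds to the precisely timed negative signal the abstract describes.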