A biologically plausible embodied model of action discovery.

Affiliation

Department of Psychology, Adaptive Behaviour Research Group, University of Sheffield, Sheffield, UK.

Publication information

Front Neurorobot. 2013 Mar 12;7:4. doi: 10.3389/fnbot.2013.00004. eCollection 2013.

Abstract

During development, animals can spontaneously discover action-outcome pairings enabling subsequent achievement of their goals. We present a biologically plausible embodied model addressing key aspects of this process. The biomimetic model core comprises the basal ganglia and its loops through cortex and thalamus. We incorporate reinforcement learning (RL) with phasic dopamine supplying a sensory prediction error, signalling "surprising" outcomes. Phasic dopamine is used in a cortico-striatal learning rule which is consistent with recent data. We also hypothesized that objects associated with surprising outcomes acquire "novelty salience" contingent on the predictability of the outcome. To test this idea we used a simple model of prediction governing the dynamics of novelty salience and phasic dopamine. The task of the virtual robotic agent mimicked an in vivo counterpart (Gancarz et al., 2011) and involved interaction with a target object which caused a light flash, or a control object which did not. Learning took place according to two schedules. In one, the phasic outcome was delivered after interaction with the target in an unpredictable way which emulated the in vivo protocol. Without novelty salience, the model was unable to account for the experimental data. In the other schedule, the phasic outcome was reliably delivered and the agent showed a rapid increase in the number of interactions with the target which then decreased over subsequent sessions. We argue this is precisely the kind of change in behavior required to repeatedly present representations of context, action and outcome to neural networks responsible for learning action-outcome contingency. The model also showed cortico-striatal plasticity consistent with learning a new action in basal ganglia. We conclude that action learning is underpinned by a complex interplay of plasticity and stimulus salience, and that our model contains many of the elements for biological action discovery to take place.
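The abstract does not give the equations of the prediction model, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a delta-rule predictor, treats the phasic dopamine signal as the positive part of the prediction error, and treats novelty salience as a leaky trace of that signal. The function name `run_trials`, the parameter `learning_rate`, and the alternating schedule used as a stand-in for intermittent delivery are all hypothetical choices made for this example.

```python
def run_trials(outcomes, learning_rate=0.2):
    """Simulate a toy prediction model over a sequence of 0/1 outcomes.

    Returns two lists with one entry per trial:
    the phasic dopamine-like signal and the novelty salience trace.
    All dynamics here are assumptions for illustration only.
    """
    prediction = 0.0   # current expectation that the outcome occurs
    salience = 0.0     # leaky trace of recent surprise
    dopamine_trace = []
    salience_trace = []
    for outcome in outcomes:
        error = outcome - prediction
        # Assumption: only positive surprise drives the phasic signal.
        phasic_da = max(0.0, error)
        # Assumption: salience is a leaky integrator of the phasic signal.
        salience += learning_rate * (phasic_da - salience)
        # Delta-rule update of the outcome prediction.
        prediction += learning_rate * error
        dopamine_trace.append(phasic_da)
        salience_trace.append(salience)
    return dopamine_trace, salience_trace


# Reliable schedule: the outcome follows every interaction.
reliable_da, reliable_sal = run_trials([1] * 40)

# Intermittent schedule: a fixed alternating pattern, chosen only so the
# example is reproducible; the delta rule cannot learn to predict it.
intermittent_da, intermittent_sal = run_trials([1, 0] * 20)
```

Under these assumptions, the reliable schedule yields a surprise signal that shrinks on every trial while salience first rises and then falls, which loosely echoes the rise and decline in interactions described above. With the intermittent schedule the outcome never becomes fully predicted, so the surprise signal on outcome trials stays well above zero.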
