Evolving spike-timing-dependent plasticity for single-trial learning in robots.

Author information

Di Paolo Ezequiel A

Affiliation

School of Cognitive and Computing Sciences, University of Sussex, Brighton BN1 9QH, UK.

Publication information

Philos Trans A Math Phys Eng Sci. 2003 Oct 15;361(1811):2299-319. doi: 10.1098/rsta.2003.1256.

DOI:10.1098/rsta.2003.1256

PMID:14599321

Abstract

Single-trial learning is studied in an evolved robot model of synaptic spike-timing-dependent plasticity (STDP). Robots must perform positive phototaxis but must learn to perform negative phototaxis in the presence of a short-lived aversive sound stimulus. STDP acts at the millisecond range and depends asymmetrically on the relative timing of pre- and post-synaptic spikes. Although it has been involved in learning models of input prediction, these models require the iterated presentation of the input pattern, and it is hard to see how this mechanism could sustain single-trial learning over a time-scale of tens of seconds. An incremental evolutionary approach is used to answer this question. The evolved robots succeed in learning the appropriate behaviour, but learning does not depend on achieving the right synaptic configuration but rather the right pattern of neural activity. Robot performance during positive phototaxis is quite robust to loss of spike-timing information, but in contrast, this loss is catastrophic for learning negative phototaxis where entrained firing is common. Tests show that the final weight configuration carries no information about whether a robot is performing one behaviour or the other. Fixing weights, however, has the effect of degrading performance, thus demonstrating that plasticity is used to sustain the neural activity corresponding both to the normal phototaxis condition and to the learned behaviour. The implications and limitations of this result are discussed.
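
The abstract describes STDP as acting at the millisecond range and depending asymmetrically on the relative timing of pre- and post-synaptic spikes. As a rough illustration of what such an asymmetric timing window can look like, the sketch below uses the commonly cited exponential form. This is an assumption made for illustration only: the abstract does not state the rule or the evolved parameters used in the paper, and the function name, amplitudes (`a_plus`, `a_minus`) and time constants (`tau_plus_ms`, `tau_minus_ms`) are all hypothetical.

```python
import math


def stdp_weight_change(delta_t_ms, a_plus=0.01, a_minus=0.012,
                       tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Return an illustrative weight change for one pre/post spike pair.

    delta_t_ms is t_post - t_pre in milliseconds. All parameter values
    are hypothetical placeholders, not values taken from the paper.
    """
    if delta_t_ms > 0:
        # Pre-synaptic spike precedes post-synaptic spike: potentiation.
        return a_plus * math.exp(-delta_t_ms / tau_plus_ms)
    if delta_t_ms < 0:
        # Post-synaptic spike precedes pre-synaptic spike: depression.
        return -a_minus * math.exp(delta_t_ms / tau_minus_ms)
    # Simultaneous spikes: no change in this simplified sketch.
    return 0.0


if __name__ == "__main__":
    for dt in (-40.0, -10.0, 10.0, 40.0):
        print(dt, stdp_weight_change(dt))
```

With these placeholder values the change is positive when the pre-synaptic spike comes first, negative when it comes second, and shrinks towards zero as the spikes move tens of milliseconds apart, which is why it is not obvious how such a rule could support learning over tens of seconds.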
