Isotropic-sequence-order learning in a closed-loop behavioural system.

Authors

Porr Bernd, Wörgötter Florentin

Affiliation

Department of Psychology, University of Stirling, Stirling FK9 4LA, UK.

Publication

Philos Trans A Math Phys Eng Sci. 2003 Oct 15;361(1811):2225-44. doi: 10.1098/rsta.2003.1273.

DOI: 10.1098/rsta.2003.1273
PMID: 14599317

Abstract

The simplest form of sensor-motor control is obtained with a reflex. In this case the reflex can be interpreted as part of a closed-loop control paradigm which measures a sensor input and generates a motor reaction as soon as the sensor signal deviates from its desired (resting) state. This is a typical case of feedback control. However, reflex reactions are tardy, because they always occur only after a (for example, unpleasant) reflex-eliciting sensor event. This defines an objective problem for an organism which can only be avoided if the corresponding motor reaction is generated earlier. The goal of this study is to design a closed-loop control situation where temporal-sequence learning supersedes a tardy reflex reaction with a proactive anticipatory action. We achieve this by employing a second, earlier-occurring and causally coupled sensor event. An appropriate motor reaction to this early event prevents triggering of the original, primary reflex. Such causally coupled sensor events are common for animals, for example when smell predicts taste or when heat radiation precedes pain. We show that trying to achieve anticipatory control is a fundamentally different goal from trying to model a classical conditioning paradigm, which is an open-loop condition. To this end, we use a novel learning rule for temporal-sequence learning called isotropic-sequence-order (ISO) learning, which performs a confounded correlation between the primary sensor signal associated with the reflex and a predictive, earlier-occurring sensor input. In this way, the system learns the relation between the primary reflex and the earlier sensor input in order to create an earlier-occurring motor reaction. As a consequence of learning, the primary reflex is no longer triggered and thus remains permanently in its desired resting state.
In a robot application, we demonstrate that ISO learning can successfully solve the classical obstacle-avoidance task by learning to correlate a built-in reflex behaviour (retraction after touching) with earlier-arising signals from range finders (before touching). Finally, we show that avoidance and attraction tasks can be combined in the same agent.
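
The abstract does not state the update rule itself. As a rough illustration only, the sketch below assumes a differential-Hebbian update in which the weight of the predictive input changes in proportion to that input's filtered trace multiplied by the step-to-step change of the output. The function name `run_trials`, the leaky-trace filter, the fixed reflex weight, and all parameter values are hypothetical choices made for this example. The behavioural feedback loop described in the abstract is not modelled here, so the sketch only shows how the temporal order of the two signals determines the sign of the learned weight.

```python
def run_trials(t_predictive, t_reflex, n_trials=50, n_steps=100,
               decay=0.9, mu=0.01):
    """Return the learned weight of the predictive input.

    Assumptions (not from the abstract): each sensor event is a unit
    pulse, each input is filtered by a leaky trace, the reflex weight
    is held fixed at 1.0, and the weight update is
        dw = mu * filtered_input * (output - previous_output).
    """
    w_reflex = 1.0   # fixed reflex pathway (assumption)
    w_pred = 0.0     # plastic weight of the earlier, predictive input
    for _ in range(n_trials):
        u_reflex = 0.0   # leaky trace of the reflex signal
        u_pred = 0.0     # leaky trace of the predictive signal
        v_prev = 0.0     # output at the previous time step
        for t in range(n_steps):
            # Update the traces: decay, then add a unit pulse at the event time.
            u_reflex = decay * u_reflex + (1.0 if t == t_reflex else 0.0)
            u_pred = decay * u_pred + (1.0 if t == t_predictive else 0.0)
            # Output is the weighted sum of both filtered inputs.
            v = w_reflex * u_reflex + w_pred * u_pred
            # Correlate the predictive trace with the change of the output.
            w_pred += mu * u_pred * (v - v_prev)
            v_prev = v
    return w_pred


if __name__ == "__main__":
    # Predictive signal 10 steps before the reflex signal.
    print("predictive first:", run_trials(t_predictive=10, t_reflex=20))
    # Same signals in the opposite order.
    print("reflex first:    ", run_trials(t_predictive=20, t_reflex=10))
```

Under these assumptions the weight grows when the predictive pulse precedes the reflex pulse and becomes negative when the order is reversed. In a closed-loop setting a sufficiently large weight would let the early reaction prevent the reflex event altogether, which this open-loop sketch does not capture.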


