
SOVEREIGN: An autonomous neural system for incrementally learning planned action sequences to navigate towards a rewarded goal.

Author Information

William Gnadt, Stephen Grossberg

Affiliation

Department of Cognitive and Neural Systems, Center for Adaptive Systems, Center of Excellence for Learning in Education, Science and Technology, Boston University, Boston, MA 02215, United States.

Publication Information

Neural Netw. 2008 Jun;21(5):699-758. doi: 10.1016/j.neunet.2007.09.016. Epub 2007 Oct 7.

Abstract

How do reactive and planned behaviors interact in real time? How are sequences of such behaviors released at appropriate times during autonomous navigation to realize valued goals? Controllers for both animals and mobile robots, or animats, need reactive mechanisms for exploration, and learned plans to reach goal objects once an environment becomes familiar. The SOVEREIGN (Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-oriented Navigation) animat model embodies these capabilities, and is tested in a 3D virtual reality environment. SOVEREIGN includes several interacting subsystems which model complementary properties of cortical What and Where processing streams and which clarify similarities between mechanisms for navigation and arm movement control. As the animat explores an environment, visual inputs are processed by networks that are sensitive to visual form and motion in the What and Where streams, respectively. Position-invariant and size-invariant recognition categories are learned by real-time incremental learning in the What stream. Estimates of target position relative to the animat are computed in the Where stream, and can activate approach movements toward the target. Motion cues from animat locomotion can elicit head-orienting movements to bring a new target into view. Approach and orienting movements are alternately performed during animat navigation. Cumulative estimates of each movement are derived from interacting proprioceptive and visual cues. Movement sequences are stored within a motor working memory. Sequences of visual categories are stored in a sensory working memory. These working memories trigger learning of sensory and motor sequence categories, or plans, which together control planned movements. Predictively effective chunk combinations are selectively enhanced via reinforcement learning when the animat is rewarded. Selected planning chunks effect a gradual transition from variable reactive exploratory movements to efficient goal-oriented planned movement sequences. Volitional signals gate interactions between model subsystems and the release of overt behaviors. The model can control different motor sequences under different motivational states and learns more efficient sequences to rewarded goals as exploration proceeds.
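
To make the abstract's reactive-to-planned transition concrete, below is a minimal Python sketch of one idea it describes: movement sequences executed during exploration are treated as chunks whose strengths are selectively enhanced by reinforcement on rewarded trials, and a chunk is released as a plan once its strength crosses a threshold. This is an illustration only, not the paper's neural network equations; the class SequenceChunker, the move names, the learning rate, and the threshold are all hypothetical stand-ins.

    from collections import defaultdict
    from itertools import product

    # Toy move repertoire, standing in for the alternating head-orienting
    # and approach movements described in the abstract.
    MOVES = ["orient_left", "orient_right", "approach"]
    REWARDED_SEQUENCE = ("orient_right", "approach")  # hypothetical path to the goal

    class SequenceChunker:
        """Learns motor 'plan chunks' from sequences held in a working memory
        and strengthens them by reinforcement when the goal is rewarded.
        A sketch of the idea, not the paper's actual model."""

        def __init__(self, learning_rate=0.7, selection_threshold=0.6):
            self.learning_rate = learning_rate              # reinforcement step size
            self.selection_threshold = selection_threshold  # strength needed to act as a plan
            self.chunk_strength = defaultdict(float)        # chunk -> learned strength

        def reinforce(self, sequence, reward):
            # Move chunk strength toward the obtained reward, so predictively
            # effective chunks are selectively enhanced across trials.
            chunk = tuple(sequence)
            self.chunk_strength[chunk] += self.learning_rate * (reward - self.chunk_strength[chunk])

        def select_plan(self):
            # Release the strongest chunk as a planned sequence once it crosses
            # threshold; otherwise fall back to reactive exploration.
            if not self.chunk_strength:
                return None
            chunk, strength = max(self.chunk_strength.items(), key=lambda kv: kv[1])
            return list(chunk) if strength >= self.selection_threshold else None

    if __name__ == "__main__":
        chunker = SequenceChunker()
        # Reactive exploration sketched as systematic trial of short move sequences.
        exploration = iter(product(MOVES, repeat=2))
        for trial in range(12):
            plan = chunker.select_plan()
            sequence = plan if plan is not None else list(next(exploration))
            reward = 1.0 if tuple(sequence) == REWARDED_SEQUENCE else 0.0
            chunker.reinforce(sequence, reward)
            mode = "planned " if plan is not None else "reactive"
            print(f"trial {trial:2d} [{mode}] {sequence} reward={reward}")

Running the sketch shows early trials executing variable exploratory sequences; once the rewarded sequence is first reinforced past threshold, later trials replay it as a planned sequence, mirroring the gradual transition from reactive exploration to efficient goal-oriented plans described above.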

