
SOVEREIGN: An autonomous neural system for incrementally learning planned action sequences to navigate towards a rewarded goal.

Author Information

William Gnadt, Stephen Grossberg

Affiliation

Department of Cognitive and Neural Systems, Center for Adaptive Systems, Center of Excellence for Learning in Education, Science and Technology, Boston University, Boston, MA 02215, United States.

Publication Information

Neural Netw. 2008 Jun;21(5):699-758. doi: 10.1016/j.neunet.2007.09.016. Epub 2007 Oct 7.

Abstract

How do reactive and planned behaviors interact in real time? How are sequences of such behaviors released at appropriate times during autonomous navigation to realize valued goals? Controllers for both animals and mobile robots, or animats, need reactive mechanisms for exploration, and learned plans to reach goal objects once an environment becomes familiar. The SOVEREIGN (Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-oriented Navigation) animat model embodies these capabilities, and is tested in a 3D virtual reality environment. SOVEREIGN includes several interacting subsystems which model complementary properties of cortical What and Where processing streams and which clarify similarities between mechanisms for navigation and arm movement control. As the animat explores an environment, visual inputs are processed by networks that are sensitive to visual form and motion in the What and Where streams, respectively. Position-invariant and size-invariant recognition categories are learned by real-time incremental learning in the What stream. Estimates of target position relative to the animat are computed in the Where stream, and can activate approach movements toward the target. Motion cues from animat locomotion can elicit head-orienting movements to bring a new target into view. Approach and orienting movements are alternately performed during animat navigation. Cumulative estimates of each movement are derived from interacting proprioceptive and visual cues. Movement sequences are stored within a motor working memory. Sequences of visual categories are stored in a sensory working memory. These working memories trigger learning of sensory and motor sequence categories, or plans, which together control planned movements. Predictively effective chunk combinations are selectively enhanced via reinforcement learning when the animat is rewarded. Selected planning chunks effect a gradual transition from variable reactive exploratory movements to efficient goal-oriented planned movement sequences. Volitional signals gate interactions between model subsystems and the release of overt behaviors. The model can control different motor sequences under different motivational states and learns more efficient sequences to rewarded goals as exploration proceeds.
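
To make the abstract's reactive-to-planned transition concrete, below is a minimal Python sketch of one idea it describes: movement sequences executed during exploration are treated as chunks whose strengths are selectively enhanced by reinforcement on rewarded trials, and a chunk is released as a plan once its strength crosses a threshold. This is an illustration only, not the paper's neural network equations; the class SequenceChunker, the move names, the learning rate, and the threshold are all hypothetical stand-ins.

    from collections import defaultdict
    from itertools import product

    # Toy move repertoire, standing in for the alternating head-orienting
    # and approach movements described in the abstract.
    MOVES = ["orient_left", "orient_right", "approach"]
    REWARDED_SEQUENCE = ("orient_right", "approach")  # hypothetical path to the goal

    class SequenceChunker:
        """Learns motor 'plan chunks' from sequences held in a working memory
        and strengthens them by reinforcement when the goal is rewarded.
        A sketch of the idea, not the paper's actual model."""

        def __init__(self, learning_rate=0.7, selection_threshold=0.6):
            self.learning_rate = learning_rate              # reinforcement step size
            self.selection_threshold = selection_threshold  # strength needed to act as a plan
            self.chunk_strength = defaultdict(float)        # chunk -> learned strength

        def reinforce(self, sequence, reward):
            # Move chunk strength toward the obtained reward, so predictively
            # effective chunks are selectively enhanced across trials.
            chunk = tuple(sequence)
            self.chunk_strength[chunk] += self.learning_rate * (reward - self.chunk_strength[chunk])

        def select_plan(self):
            # Release the strongest chunk as a planned sequence once it crosses
            # threshold; otherwise fall back to reactive exploration.
            if not self.chunk_strength:
                return None
            chunk, strength = max(self.chunk_strength.items(), key=lambda kv: kv[1])
            return list(chunk) if strength >= self.selection_threshold else None

    if __name__ == "__main__":
        chunker = SequenceChunker()
        # Reactive exploration sketched as systematic trial of short move sequences.
        exploration = iter(product(MOVES, repeat=2))
        for trial in range(12):
            plan = chunker.select_plan()
            sequence = plan if plan is not None else list(next(exploration))
            reward = 1.0 if tuple(sequence) == REWARDED_SEQUENCE else 0.0
            chunker.reinforce(sequence, reward)
            mode = "planned " if plan is not None else "reactive"
            print(f"trial {trial:2d} [{mode}] {sequence} reward={reward}")

Running the sketch shows early trials executing variable exploratory sequences; once the rewarded sequence is first reinforced past threshold, later trials replay it as a planned sequence, mirroring the gradual transition from reactive exploration to efficient goal-oriented plans described above.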

