Hierarchical intrinsically motivated agent planning behavior with dreaming in grid environments.

Authors

Dzhivelikian Evgenii, Latyshev Artem, Kuderov Petr, Panov Aleksandr I

Affiliations

Moscow Institute of Physics and Technology, Dolgoprudny, Russia.

Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Moscow, Russia.

Publication information

Brain Inform. 2022 Apr 2;9(1):8. doi: 10.1186/s40708-022-00156-6.

DOI: 10.1186/s40708-022-00156-6
PMID: 35366128
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8976870/
Abstract

Biologically plausible models of learning may provide crucial insight for building autonomous intelligent agents capable of performing a wide range of tasks. In this work, we propose a hierarchical model of an agent that operates in an unfamiliar environment and is driven by a reinforcement signal. We use temporal memory to learn sparse distributed representations of state-action pairs, and a basal ganglia model to learn an effective action policy at different levels of abstraction. The learned model of the environment is used in two ways: to generate an intrinsic motivation signal, which drives the agent in the absence of an extrinsic signal, and to let the agent act in imagination, which we call dreaming. We demonstrate that the proposed architecture enables an agent to effectively reach goals in grid environments.
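
The interplay of a learned environment model, an intrinsic reward, and imagined experience can be illustrated with a small sketch. This is not the authors' architecture: it replaces the temporal memory and basal ganglia model with a flat tabular Q-learner, uses a count-based novelty bonus as a stand-in intrinsic signal, and implements "dreaming" as Dyna-style replay from a memorized transition table. The grid size, goal position, function names (`step`, `train`, `greedy_rollout`), and all hyperparameters are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up


def step(state, action, size, goal):
    """Deterministic grid move, clipped at the walls; reward 1.0 only at the goal."""
    row = min(max(state[0] + action[0], 0), size - 1)
    col = min(max(state[1] + action[1], 0), size - 1)
    nxt = (row, col)
    return nxt, (1.0 if nxt == goal else 0.0), nxt == goal


def train(size=5, goal=(4, 4), episodes=300, dream_steps=10,
          alpha=0.5, gamma=0.95, eps=0.1, bonus=0.05, seed=0):
    """Tabular Q-learning with a novelty bonus and imagined (replayed) updates."""
    rng = random.Random(seed)
    q = defaultdict(float)     # Q-values keyed by (state, action index)
    model = {}                 # learned model: (state, action) -> (next, reward, done)
    seen = []                  # keys of `model`, kept as a list for cheap sampling
    visits = defaultdict(int)  # state visit counts for the intrinsic signal

    def update(s, a, r, s2, done):
        best_next = 0.0 if done else max(q[(s2, b)] for b in range(len(ACTIONS)))
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    def pick(s):
        if rng.random() < eps:
            return rng.randrange(len(ACTIONS))
        vals = [q[(s, b)] for b in range(len(ACTIONS))]
        top = max(vals)
        return rng.choice([b for b, v in enumerate(vals) if v == top])

    for _ in range(episodes):
        s, done, t = (0, 0), False, 0
        while not done and t < 100:
            a = pick(s)
            s2, r_ext, done = step(s, ACTIONS[a], size, goal)
            # Intrinsic signal (assumed form): larger for rarely visited states.
            visits[s2] += 1
            r_int = bonus / visits[s2] ** 0.5
            if (s, a) not in model:
                seen.append((s, a))
            model[(s, a)] = (s2, r_ext, done)
            update(s, a, r_ext + r_int, s2, done)
            # "Dreaming": replay remembered transitions without touching the grid.
            for _ in range(dream_steps):
                ds, da = rng.choice(seen)
                ds2, dr, dd = model[(ds, da)]
                update(ds, da, dr, ds2, dd)
            s, t = s2, t + 1
    return q


def greedy_rollout(q, size=5, goal=(4, 4), max_steps=50):
    """Follow the learned greedy policy from the top-left corner."""
    s, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda b: q[(s, b)])
        s, _, done = step(s, ACTIONS[a], size, goal)
        path.append(s)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_rollout(train()))
```

In this sketch the novelty bonus is the only reward the agent sees until it first reaches the goal, while the replay loop spreads the goal reward backward through the table without further interaction with the grid. The hierarchy described in the abstract is deliberately left out.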

Similar articles

1
Hierarchical intrinsically motivated agent planning behavior with dreaming in grid environments.
Brain Inform. 2022 Apr 2;9(1):8. doi: 10.1186/s40708-022-00156-6.
2
Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2019 Nov;30(11):3409-3418. doi: 10.1109/TNNLS.2019.2891792. Epub 2019 Jan 29.
3
End-to-End Autonomous Exploration with Deep Reinforcement Learning and Intrinsic Motivation.
Comput Intell Neurosci. 2021 Dec 16;2021:9945044. doi: 10.1155/2021/9945044. eCollection 2021.
4
Learning a Set of Interrelated Tasks by Using a Succession of Motor Policies for a Socially Guided Intrinsically Motivated Learner.
Front Neurorobot. 2019 Jan 8;12:87. doi: 10.3389/fnbot.2018.00087. eCollection 2018.
5
Incremental learning of skill collections based on intrinsic motivation.
Front Neurorobot. 2013 Jul 26;7:11. doi: 10.3389/fnbot.2013.00011. eCollection 2013.
6
Intrinsically motivated action-outcome learning and goal-based action recall: a system-level bio-constrained computational model.
Neural Netw. 2013 May;41:168-87. doi: 10.1016/j.neunet.2012.09.015. Epub 2012 Oct 4.
7
Intrinsically Motivated Exploration of Learned Goal Spaces.
Front Neurorobot. 2021 Jan 12;14:555271. doi: 10.3389/fnbot.2020.555271. eCollection 2020.
8
Which is the best intrinsic motivation signal for learning multiple skills?
Front Neurorobot. 2013 Nov 12;7:22. doi: 10.3389/fnbot.2013.00022. eCollection 2013.
9
Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop.
Front Neurorobot. 2018 Aug 30;12:45. doi: 10.3389/fnbot.2018.00045. eCollection 2018.
10
Leveraging Predictions of Task-Related Latents for Interactive Visual Navigation.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):704-717. doi: 10.1109/TNNLS.2023.3335416. Epub 2025 Jan 7.

Cited by

1
IoT and Deep Learning-Based Farmer Safety System.
Sensors (Basel). 2023 Mar 8;23(6):2951. doi: 10.3390/s23062951.
