


Reinforcement learning algorithms for robotic navigation in dynamic environments.

Authors

Yen Gary G, Hickey Travis W

Affiliation

Intelligent Systems and Control Laboratory, School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078, USA.

Publication

ISA Trans. 2004 Apr;43(2):217-30. doi: 10.1016/s0019-0578(07)60032-9.

DOI: 10.1016/s0019-0578(07)60032-9
PMID: 15098582
Abstract

The purpose of this study was to examine improvements to reinforcement learning (RL) algorithms in order to successfully interact within dynamic environments. The scope of the research was that of RL algorithms as applied to robotic navigation. Proposed improvements include: addition of a forgetting mechanism, use of feature based state inputs, and hierarchical structuring of an RL agent. Simulations were performed to evaluate the individual merits and flaws of each proposal, to compare proposed methods to prior established methods, and to compare proposed methods to theoretically optimal solutions. Incorporation of a forgetting mechanism did considerably improve the learning times of RL agents in a dynamic environment. However, direct implementation of a feature-based RL agent did not result in any performance enhancements, as pure feature-based navigation results in a lack of positional awareness, and the inability of the agent to determine the location of the goal state. Inclusion of a hierarchical structure in an RL agent resulted in significantly improved performance, specifically when one layer of the hierarchy included a feature-based agent for obstacle avoidance, and a standard RL agent for global navigation. In summary, the inclusion of a forgetting mechanism, and the use of a hierarchically structured RL agent offer substantially increased performance when compared to traditional RL agents navigating in a dynamic environment.
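The abstract does not spell out how the forgetting mechanism is implemented. As a rough sketch only, assuming a tabular Q-learning agent (the paper's actual update rule may differ), one common form of forgetting decays every Q-value that was not just refreshed back toward its initial value, so estimates learned before the environment changed gradually lose influence:

```python
import random


class ForgettingQAgent:
    """Tabular Q-learning with a simple forgetting mechanism (hypothetical
    illustration; the paper's exact formulation is not given in the abstract)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, forget_rate=0.01, q_init=0.0):
        self.q = [[q_init] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.forget_rate = forget_rate
        self.q_init = q_init
        self.n_actions = n_actions

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, s, a, r, s_next):
        # Standard Q-learning update for the visited state-action pair.
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
        # Forgetting: every other entry decays toward its initial value,
        # so knowledge about parts of a changed environment fades unless
        # it keeps being re-confirmed by new experience.
        for si, row in enumerate(self.q):
            for ai in range(self.n_actions):
                if si == s and ai == a:
                    continue
                row[ai] += self.forget_rate * (self.q_init - row[ai])
```

With `forget_rate=0`, this reduces to ordinary Q-learning; a small positive rate trades some sample efficiency in static regions for faster re-learning after the environment changes.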

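The hierarchical structure the abstract describes (a feature-based layer for obstacle avoidance over a standard RL layer for global navigation) can be sketched as a simple dispatcher; this is a hypothetical illustration, and the names, threshold rule, and interfaces below are assumptions, not the paper's design:

```python
class HierarchicalNavigator:
    """Two-layer navigation agent (hypothetical sketch): a reactive,
    feature-based layer avoids nearby obstacles, while a position-aware
    layer handles global navigation toward the goal."""

    def __init__(self, avoider, navigator, danger_threshold=1.0):
        self.avoider = avoider            # acts on local sensor features
        self.navigator = navigator        # acts on global position
        self.danger_threshold = danger_threshold

    def act(self, position, obstacle_distance, features):
        # When an obstacle is within the danger threshold, the reactive
        # feature-based layer overrides the global navigator; otherwise
        # the position-aware layer steers toward the goal.
        if obstacle_distance < self.danger_threshold:
            return self.avoider.act(features)
        return self.navigator.act(position)
```

This division of labor matches the abstract's finding: pure feature-based navigation alone lacks positional awareness, but as a subordinate obstacle-avoidance layer beneath a standard RL navigator it improves overall performance.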

Similar Articles

1
Reinforcement learning algorithms for robotic navigation in dynamic environments.
ISA Trans. 2004 Apr;43(2):217-30. doi: 10.1016/s0019-0578(07)60032-9.
2
RL-DOVS: Reinforcement Learning for Autonomous Robot Navigation in Dynamic Environments.
Sensors (Basel). 2022 May 19;22(10):3847. doi: 10.3390/s22103847.
3
Reinforcement Learning Algorithms and Applications in Healthcare and Robotics: A Comprehensive and Systematic Review.
Sensors (Basel). 2024 Apr 11;24(8):2461. doi: 10.3390/s24082461.
4
MOSAIC for multiple-reward environments.
Neural Comput. 2012 Mar;24(3):577-606. doi: 10.1162/NECO_a_00246. Epub 2011 Dec 14.
5
A clustering-based graph Laplacian framework for value function approximation in reinforcement learning.
IEEE Trans Cybern. 2014 Dec;44(12):2613-25. doi: 10.1109/TCYB.2014.2311578. Epub 2014 Apr 25.
6
SOVEREIGN: An autonomous neural system for incrementally learning planned action sequences to navigate towards a rewarded goal.
Neural Netw. 2008 Jun;21(5):699-758. doi: 10.1016/j.neunet.2007.09.016. Epub 2007 Oct 7.
7
Application of reinforcement learning in cognitive radio networks: models and algorithms.
ScientificWorldJournal. 2014;2014:209810. doi: 10.1155/2014/209810. Epub 2014 Jun 5.
8
Human locomotion with reinforcement learning using bioinspired reward reshaping strategies.
Med Biol Eng Comput. 2021 Jan;59(1):243-256. doi: 10.1007/s11517-020-02309-3. Epub 2021 Jan 8.
9
Kernel-based least squares policy iteration for reinforcement learning.
IEEE Trans Neural Netw. 2007 Jul;18(4):973-92. doi: 10.1109/TNN.2007.899161.
10
Ensemble algorithms in reinforcement learning.
IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):930-6. doi: 10.1109/TSMCB.2008.920231.

Cited By

1
Path-finding in real and simulated rats: assessing the influence of path characteristics on navigation learning.
J Comput Neurosci. 2008 Dec;25(3):562-82. doi: 10.1007/s10827-008-0094-6. Epub 2008 Apr 30.