


Reinforcement learning algorithms for robotic navigation in dynamic environments.

Authors

Yen Gary G, Hickey Travis W

Affiliation

Intelligent Systems and Control Laboratory, School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078, USA.

Publication

ISA Trans. 2004 Apr;43(2):217-30. doi: 10.1016/s0019-0578(07)60032-9.

DOI: 10.1016/s0019-0578(07)60032-9
PMID: 15098582
Abstract

The purpose of this study was to examine improvements to reinforcement learning (RL) algorithms in order to successfully interact within dynamic environments. The scope of the research was that of RL algorithms as applied to robotic navigation. Proposed improvements include: addition of a forgetting mechanism, use of feature based state inputs, and hierarchical structuring of an RL agent. Simulations were performed to evaluate the individual merits and flaws of each proposal, to compare proposed methods to prior established methods, and to compare proposed methods to theoretically optimal solutions. Incorporation of a forgetting mechanism did considerably improve the learning times of RL agents in a dynamic environment. However, direct implementation of a feature-based RL agent did not result in any performance enhancements, as pure feature-based navigation results in a lack of positional awareness, and the inability of the agent to determine the location of the goal state. Inclusion of a hierarchical structure in an RL agent resulted in significantly improved performance, specifically when one layer of the hierarchy included a feature-based agent for obstacle avoidance, and a standard RL agent for global navigation. In summary, the inclusion of a forgetting mechanism, and the use of a hierarchically structured RL agent offer substantially increased performance when compared to traditional RL agents navigating in a dynamic environment.
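The abstract does not spell out how the forgetting mechanism is implemented. As a rough sketch only, assuming a tabular Q-learning agent (the paper's actual update rule may differ), one common form of forgetting decays every Q-value that was not just refreshed back toward its initial value, so estimates learned before the environment changed gradually lose influence:

```python
import random


class ForgettingQAgent:
    """Tabular Q-learning with a simple forgetting mechanism (hypothetical
    illustration; the paper's exact formulation is not given in the abstract)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, forget_rate=0.01, q_init=0.0):
        self.q = [[q_init] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.forget_rate = forget_rate
        self.q_init = q_init
        self.n_actions = n_actions

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, s, a, r, s_next):
        # Standard Q-learning update for the visited state-action pair.
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
        # Forgetting: every other entry decays toward its initial value,
        # so knowledge about parts of a changed environment fades unless
        # it keeps being re-confirmed by new experience.
        for si, row in enumerate(self.q):
            for ai in range(self.n_actions):
                if si == s and ai == a:
                    continue
                row[ai] += self.forget_rate * (self.q_init - row[ai])
```

With `forget_rate=0`, this reduces to ordinary Q-learning; a small positive rate trades some sample efficiency in static regions for faster re-learning after the environment changes.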

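The hierarchical structure the abstract describes (a feature-based layer for obstacle avoidance over a standard RL layer for global navigation) can be sketched as a simple dispatcher; this is a hypothetical illustration, and the names, threshold rule, and interfaces below are assumptions, not the paper's design:

```python
class HierarchicalNavigator:
    """Two-layer navigation agent (hypothetical sketch): a reactive,
    feature-based layer avoids nearby obstacles, while a position-aware
    layer handles global navigation toward the goal."""

    def __init__(self, avoider, navigator, danger_threshold=1.0):
        self.avoider = avoider            # acts on local sensor features
        self.navigator = navigator        # acts on global position
        self.danger_threshold = danger_threshold

    def act(self, position, obstacle_distance, features):
        # When an obstacle is within the danger threshold, the reactive
        # feature-based layer overrides the global navigator; otherwise
        # the position-aware layer steers toward the goal.
        if obstacle_distance < self.danger_threshold:
            return self.avoider.act(features)
        return self.navigator.act(position)
```

This division of labor matches the abstract's finding: pure feature-based navigation alone lacks positional awareness, but as a subordinate obstacle-avoidance layer beneath a standard RL navigator it improves overall performance.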

Similar Articles

1
Reinforcement learning algorithms for robotic navigation in dynamic environments.
ISA Trans. 2004 Apr;43(2):217-30. doi: 10.1016/s0019-0578(07)60032-9.
2
RL-DOVS: Reinforcement Learning for Autonomous Robot Navigation in Dynamic Environments.
Sensors (Basel). 2022 May 19;22(10):3847. doi: 10.3390/s22103847.
3
Reinforcement Learning Algorithms and Applications in Healthcare and Robotics: A Comprehensive and Systematic Review.
Sensors (Basel). 2024 Apr 11;24(8):2461. doi: 10.3390/s24082461.
4
MOSAIC for multiple-reward environments.
Neural Comput. 2012 Mar;24(3):577-606. doi: 10.1162/NECO_a_00246. Epub 2011 Dec 14.
5
A clustering-based graph Laplacian framework for value function approximation in reinforcement learning.
IEEE Trans Cybern. 2014 Dec;44(12):2613-25. doi: 10.1109/TCYB.2014.2311578. Epub 2014 Apr 25.
6
SOVEREIGN: An autonomous neural system for incrementally learning planned action sequences to navigate towards a rewarded goal.
Neural Netw. 2008 Jun;21(5):699-758. doi: 10.1016/j.neunet.2007.09.016. Epub 2007 Oct 7.
7
Application of reinforcement learning in cognitive radio networks: models and algorithms.
ScientificWorldJournal. 2014;2014:209810. doi: 10.1155/2014/209810. Epub 2014 Jun 5.
8
Human locomotion with reinforcement learning using bioinspired reward reshaping strategies.
Med Biol Eng Comput. 2021 Jan;59(1):243-256. doi: 10.1007/s11517-020-02309-3. Epub 2021 Jan 8.
9
Kernel-based least squares policy iteration for reinforcement learning.
IEEE Trans Neural Netw. 2007 Jul;18(4):973-92. doi: 10.1109/TNN.2007.899161.
10
Ensemble algorithms in reinforcement learning.
IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):930-6. doi: 10.1109/TSMCB.2008.920231.

Cited By

1
Path-finding in real and simulated rats: assessing the influence of path characteristics on navigation learning.
J Comput Neurosci. 2008 Dec;25(3):562-82. doi: 10.1007/s10827-008-0094-6. Epub 2008 Apr 30.