

Velocity range-based reward shaping technique for effective map-less navigation with LiDAR sensor and deep reinforcement learning.

Authors

Lee HyeokSoo, Jeong Jongpil

Affiliations

Department of Smart Factory Convergence, AI Factory Lab, Sungkyunkwan University, Suwon, Republic of Korea.

Research & Development Team, THiRA-UTECH Co., Ltd., Seoul, Republic of Korea.

Publication

Front Neurorobot. 2023 Sep 6;17:1210442. doi: 10.3389/fnbot.2023.1210442. eCollection 2023.

DOI: 10.3389/fnbot.2023.1210442
PMID: 37744086
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10512054/
Abstract

In recent years, sensor components comparable to human sensory functions have developed rapidly on the hardware side, enabling information acquisition beyond human capability, while on the software side, artificial intelligence technology now supports cognitive abilities and decision-making such as prediction, analysis, and judgment. These advances are being applied across many industries and fields. In particular, new hardware and software technologies are being rapidly adopted in robotics products, reaching a level of performance and completeness that was previously unimaginable. In this paper, we study optimal path planning for autonomous driving using LiDAR sensors and deep reinforcement learning in workplaces without maps or grid coordinates, targeting the mobile robots widely used in logistics and manufacturing sites. To this end, we review the hardware configuration of mobile robots capable of autonomous driving, examine the characteristics of the main core sensors, and survey the core technologies of autonomous driving. We then review an appropriate deep reinforcement learning algorithm for autonomous mobile-robot driving, define a deep neural network for converting autonomous-driving data, and define a reward function for path planning. These components were built into a simulation environment to verify autonomous path planning experimentally, and an additional reward technique, the "Velocity Range-based Evaluation Method," is proposed to further improve the performance indicators required in real deployments; its effectiveness is verified. The simulation environment and detailed experimental results are described in this paper, which is intended as guidance and reference for applying these technologies in the field.
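The abstract does not give the concrete form of the reward function or the "Velocity Range-based Evaluation Method." As an illustration only, a shaped reward for map-less navigation that adds a velocity-range bonus on top of the usual goal-progress and collision terms might look like the sketch below; all names, thresholds, and weights (`collision_dist`, `v_lo`, `v_hi`, the terminal ±100 rewards) are hypothetical, not taken from the paper.

```python
# Illustrative sketch of velocity-range-based reward shaping for map-less
# navigation with a LiDAR-equipped mobile robot. This is NOT the paper's
# implementation; every constant here is an assumed placeholder.

def shaped_reward(prev_dist, curr_dist, min_lidar, linear_vel,
                  collision_dist=0.2, goal_dist=0.3,
                  v_lo=0.15, v_hi=0.25):
    """Return (reward, episode_done).

    prev_dist / curr_dist: distance to goal before and after the step
    min_lidar: smallest range in the current LiDAR scan
    linear_vel: robot's current linear velocity
    """
    if min_lidar < collision_dist:          # obstacle too close: collision
        return -100.0, True
    if curr_dist < goal_dist:               # goal reached
        return 100.0, True
    reward = (prev_dist - curr_dist) * 10.0  # dense progress-toward-goal term
    if v_lo <= linear_vel <= v_hi:           # shaping bonus for keeping speed
        reward += 1.0                        # inside the target velocity range
    return reward, False
```

The shaping bonus rewards the agent for holding its speed inside a preferred band, which is one plausible way to encode the velocity-range idea the title refers to; the exact band and weighting would have to come from the paper itself.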

Figure images (fnbot-17-1210442, g0001–g0014):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/0aded0b72ecf/fnbot-17-1210442-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/210fbaafe575/fnbot-17-1210442-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/4b967a8d35b8/fnbot-17-1210442-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/09bf8c307c9b/fnbot-17-1210442-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/07b60ffd2771/fnbot-17-1210442-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/5b1e09ae5ca8/fnbot-17-1210442-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/b39815551c5d/fnbot-17-1210442-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/135dc1d1fedb/fnbot-17-1210442-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/de0f8438aa62/fnbot-17-1210442-g0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/f0a8c82723d5/fnbot-17-1210442-g0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/78d262a63751/fnbot-17-1210442-g0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/6e1f2f41f13a/fnbot-17-1210442-g0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/ddb1f8de0d3c/fnbot-17-1210442-g0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb4/10512054/1206400b287b/fnbot-17-1210442-g0014.jpg

Similar articles

1. Velocity range-based reward shaping technique for effective map-less navigation with LiDAR sensor and deep reinforcement learning.
   Front Neurorobot. 2023 Sep 6;17:1210442. doi: 10.3389/fnbot.2023.1210442. eCollection 2023.
2. Autonomous Navigation by Mobile Robot with Sensor Fusion Based on Deep Reinforcement Learning.
   Sensors (Basel). 2024 Jun 16;24(12):3895. doi: 10.3390/s24123895.
3. Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments.
   Sensors (Basel). 2022 Dec 7;22(24):9574. doi: 10.3390/s22249574.
4. The Impact of LiDAR Configuration on Goal-Based Navigation within a Deep Reinforcement Learning Framework.
   Sensors (Basel). 2023 Dec 9;23(24):9732. doi: 10.3390/s23249732.
5. Autonomous Driving of Mobile Robots in Dynamic Environments Based on Deep Deterministic Policy Gradient: Reward Shaping and Hindsight Experience Replay.
   Biomimetics (Basel). 2024 Jan 13;9(1):51. doi: 10.3390/biomimetics9010051.
6. Autonomous Navigation System of Greenhouse Mobile Robot Based on 3D Lidar and 2D Lidar SLAM.
   Front Plant Sci. 2022 Mar 10;13:815218. doi: 10.3389/fpls.2022.815218. eCollection 2022.
7. Research and Implementation of Autonomous Navigation for Mobile Robots Based on SLAM Algorithm under ROS.
   Sensors (Basel). 2022 May 31;22(11):4172. doi: 10.3390/s22114172.
8. Research on obstacle avoidance optimization and path planning of autonomous vehicles based on attention mechanism combined with multimodal information decision-making thoughts of robots.
   Front Neurorobot. 2023 Sep 22;17:1269447. doi: 10.3389/fnbot.2023.1269447. eCollection 2023.
9. Deep reinforcement learning-aided autonomous navigation with landmark generators.
   Front Neurorobot. 2023 Aug 22;17:1200214. doi: 10.3389/fnbot.2023.1200214. eCollection 2023.
10. The Path Planning of Mobile Robot by Neural Networks and Hierarchical Reinforcement Learning.
   Front Neurorobot. 2020 Oct 2;14:63. doi: 10.3389/fnbot.2020.00063. eCollection 2020.

References cited by this article

1. The Path Planning of Mobile Robot by Neural Networks and Hierarchical Reinforcement Learning.
   Front Neurorobot. 2020 Oct 2;14:63. doi: 10.3389/fnbot.2020.00063. eCollection 2020.