
Distributed Non-Communicating Multi-Robot Collision Avoidance via Map-Based Deep Reinforcement Learning

Authors

Chen Guangda, Yao Shunyi, Ma Jun, Pan Lifan, Chen Yu'an, Xu Pei, Ji Jianmin, Chen Xiaoping

Affiliations

School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China.

School of Data Science, University of Science and Technology of China, Hefei 230026, China.

Publication

Sensors (Basel). 2020 Aug 27;20(17):4836. doi: 10.3390/s20174836.

DOI: 10.3390/s20174836
PMID: 32867080
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7506975/
Abstract

It is challenging for multiple robots of different shapes to avoid obstacles safely and efficiently in distributed, communication-free scenarios, where robots do not communicate with each other and only sense other robots' positions and the obstacles around them. Most existing multi-robot collision avoidance systems either require communication between robots or require expensive movement data of other robots, such as velocities, accelerations and paths. In this paper, we propose a map-based deep reinforcement learning approach for multi-robot collision avoidance in a distributed and communication-free environment. We use a robot's egocentric local grid map to represent the environmental information around it, including its own shape and the observable appearances of other robots and obstacles; such maps can be easily generated with multiple sensors or sensor fusion. We then apply the distributed proximal policy optimization (DPPO) algorithm to train a convolutional neural network that directly maps three frames of egocentric local grid maps, together with the robot's relative local goal position, into low-level robot control commands. Compared with other methods, the map-based approach is more robust to noisy sensor data, does not require other robots' movement data and accounts for the sizes and shapes of the robots involved, which makes it more efficient and easier to deploy on real robots. We first train the neural network with DPPO in a dedicated multi-robot simulator, using a multi-stage curriculum learning strategy over multiple scenarios to improve performance. We then deploy the trained model to real robots to perform collision avoidance during navigation without tedious parameter tuning. We evaluate the approach in multiple scenarios, both in the simulator and on four differential-drive mobile robots in the real world. Both qualitative and quantitative experiments show that our approach is efficient and outperforms existing DRL-based approaches on many indicators. We also conduct ablation studies showing the positive effects of using egocentric grid maps and multi-stage curriculum learning.
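The egocentric local grid map described in the abstract can be pictured as a simple rasterization step: world-frame obstacle points are transformed into the robot's frame and binned into an occupancy grid centred on the robot and aligned with its heading. The sketch below is a minimal illustration of that idea only; the map size, resolution, and function name are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def egocentric_grid_map(obstacles_xy, robot_pose, size=48, resolution=0.1):
    """Rasterize world-frame obstacle points into an egocentric occupancy
    grid centred on the robot and aligned with its heading. A minimal
    sketch of the map-based representation; cell size and map extent are
    illustrative assumptions."""
    x, y, theta = robot_pose
    # Translate points to the robot, then rotate by -theta into its frame.
    c, s = np.cos(-theta), np.sin(-theta)
    pts = np.asarray(obstacles_xy, dtype=float) - [x, y]
    pts = pts @ np.array([[c, -s], [s, c]]).T
    # Map robot-frame metres to grid indices (robot sits at the centre cell).
    idx = np.floor(pts / resolution).astype(int) + size // 2
    grid = np.zeros((size, size), dtype=np.uint8)
    valid = (idx >= 0).all(axis=1) & (idx < size).all(axis=1)
    grid[idx[valid, 1], idx[valid, 0]] = 1  # row = y, col = x
    return grid

# An obstacle 1 m directly ahead of a robot facing +x lands 10 cells
# (1 m / 0.1 m) in front of the map centre.
grid = egocentric_grid_map([(2.0, 1.0)], robot_pose=(1.0, 1.0, 0.0))
print(grid[24, 34])  # -> 1
```

In the paper's pipeline, three consecutive frames of such maps are stacked as channels and fed, together with the relative local goal, into the CNN policy trained with DPPO.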


Similar Articles

1. Distributed Non-Communicating Multi-Robot Collision Avoidance via Map-Based Deep Reinforcement Learning.
Sensors (Basel). 2020 Aug 27;20(17):4836. doi: 10.3390/s20174836.
2. Non-Communication Decentralized Multi-Robot Collision Avoidance in Grid Map Workspace with Double Deep Q-Network.
Sensors (Basel). 2021 Jan 27;21(3):841. doi: 10.3390/s21030841.
3. Sensor Fusion Based Model for Collision Free Mobile Robot Navigation.
Sensors (Basel). 2015 Dec 26;16(1):24. doi: 10.3390/s16010024.
4. Multi-robot collision avoidance method in sweet potato fields.
Front Plant Sci. 2024 Sep 10;15:1393541. doi: 10.3389/fpls.2024.1393541. eCollection 2024.
5. The Impact of LiDAR Configuration on Goal-Based Navigation within a Deep Reinforcement Learning Framework.
Sensors (Basel). 2023 Dec 9;23(24):9732. doi: 10.3390/s23249732.
6. Autonomous Navigation by Mobile Robot with Sensor Fusion Based on Deep Reinforcement Learning.
Sensors (Basel). 2024 Jun 16;24(12):3895. doi: 10.3390/s24123895.
7. Reinforcement learning-based dynamic obstacle avoidance and integration of path planning.
Intell Serv Robot. 2021;14(5):663-677. doi: 10.1007/s11370-021-00387-2. Epub 2021 Oct 6.
8. Path planning and collision avoidance methods for distributed multi-robot systems in complex dynamic environments.
Math Biosci Eng. 2023 Jan;20(1):145-178. doi: 10.3934/mbe.2023008. Epub 2022 Sep 30.
9. ITC: Infused Tangential Curves for Smooth 2D and 3D Navigation of Mobile Robots.
Sensors (Basel). 2019 Oct 10;19(20):4384. doi: 10.3390/s19204384.
10. Deep reinforcement learning-aided autonomous navigation with landmark generators.
Front Neurorobot. 2023 Aug 22;17:1200214. doi: 10.3389/fnbot.2023.1200214. eCollection 2023.

Cited By

1. Non-Communication Decentralized Multi-Robot Collision Avoidance in Grid Map Workspace with Double Deep Q-Network.
Sensors (Basel). 2021 Jan 27;21(3):841. doi: 10.3390/s21030841.

References

1. Grandmaster level in StarCraft II using multi-agent reinforcement learning.
Nature. 2019 Nov;575(7782):350-354. doi: 10.1038/s41586-019-1724-z. Epub 2019 Oct 30.
2. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play.
Science. 2018 Dec 7;362(6419):1140-1144. doi: 10.1126/science.aar6404.
3. Mastering the game of Go without human knowledge.
Nature. 2017 Oct 18;550(7676):354-359. doi: 10.1038/nature24270.
4. Learning and development in neural networks: the importance of starting small.
Cognition. 1993 Jul;48(1):71-99. doi: 10.1016/0010-0277(93)90058-4.