Research on Robot Obstacle Avoidance and Generalization Methods Based on Fusion Policy Transfer Learning.

Authors

Wang Suyu, Xu Zhenlei, Qiao Peihong, Yue Quan, Ke Ya, Gao Feng

Affiliations

School of Mechanical and Electrical Engineering, China University of Mining and Technology, Beijing 100083, China.

Institute of Intelligent Mining and Robotics, China University of Mining and Technology, Beijing 100083, China.

Publication information

Biomimetics (Basel). 2025 Jul 25;10(8):493. doi: 10.3390/biomimetics10080493.

DOI: 10.3390/biomimetics10080493
PMID: 40862866
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12383643/
Abstract

In nature, organisms often rely on the integration of local sensory information and prior experience to flexibly adapt to complex and dynamic environments, enabling efficient path selection. This bio-inspired mechanism of perception and behavioral adjustment provides important insights for path planning in mobile robots operating under uncertainty. In recent years, the introduction of deep reinforcement learning (DRL) has empowered mobile robots to autonomously learn navigation strategies through interaction with the environment, allowing them to identify obstacle distributions and perform path planning even in unknown scenarios. To further enhance the adaptability and path planning performance of robots in complex environments, this paper develops a deep reinforcement learning framework based on the Soft Actor-Critic (SAC) algorithm. First, to address the limited adaptability of existing transfer learning methods, we propose an action-level fusion mechanism that dynamically integrates prior and current policies during inference, enabling more flexible knowledge transfer. Second, a bio-inspired radar perception optimization method is introduced, which mimics the biological mechanism of focusing on key regions while ignoring redundant information, thereby enhancing the expressiveness of sensory inputs. Finally, a reward function based on ineffective behavior recognition is designed to reduce unnecessary exploration during training. The proposed method is validated in both the Gazebo simulation environment and real-world scenarios. Experimental results demonstrate that the approach achieves faster convergence and superior obstacle avoidance performance in path planning tasks, exhibiting strong transferability and generalization across various obstacle configurations.
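The action-level fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so everything below is an assumption made for illustration: the convex combination of the two actions, the linearly decaying weight on the prior policy, and the names `fusion_weight` and `fuse_actions` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fusion_weight(step: int, decay_steps: int,
                  w_start: float = 1.0, w_end: float = 0.0) -> float:
    """Hypothetical schedule: the weight on the prior policy moves
    linearly from w_start to w_end over decay_steps steps."""
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return w_start + (w_end - w_start) * frac


def fuse_actions(prior_action: Sequence[float],
                 current_action: Sequence[float],
                 w_prior: float,
                 low: float = -1.0,
                 high: float = 1.0) -> List[float]:
    """Blend two actions element-wise (assumed convex combination),
    then clip the result to the actuator range [low, high]."""
    if len(prior_action) != len(current_action):
        raise ValueError("actions must have the same dimension")
    if not 0.0 <= w_prior <= 1.0:
        raise ValueError("w_prior must lie in [0, 1]")
    fused = [w_prior * p + (1.0 - w_prior) * c
             for p, c in zip(prior_action, current_action)]
    return [min(max(a, low), high) for a in fused]


if __name__ == "__main__":
    # Example: (linear velocity, angular velocity) commands from each policy.
    prior = [1.0, 0.0]    # action proposed by the pre-trained (prior) policy
    current = [0.0, 1.0]  # action proposed by the policy being trained
    w = fusion_weight(step=75, decay_steps=100)  # 0.25 under the linear schedule
    print(fuse_actions(prior, current, w))
```

Under these assumptions the prior policy dominates early on and the current policy gradually takes over. A state-dependent weight, for example one driven by critic estimates, would be another plausible design, but the abstract does not say which the authors use.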
