Wang Suyu, Xu Zhenlei, Qiao Peihong, Yue Quan, Ke Ya, Gao Feng
School of Mechanical and Electrical Engineering, China University of Mining and Technology, Beijing 100083, China.
Institute of Intelligent Mining and Robotics, China University of Mining and Technology, Beijing 100083, China.
Biomimetics (Basel). 2025 Jul 25;10(8):493. doi: 10.3390/biomimetics10080493.
In nature, organisms often rely on the integration of local sensory information and prior experience to flexibly adapt to complex and dynamic environments, enabling efficient path selection. This bio-inspired mechanism of perception and behavioral adjustment provides important insights for path planning in mobile robots operating under uncertainty. In recent years, the introduction of deep reinforcement learning (DRL) has empowered mobile robots to autonomously learn navigation strategies through interaction with the environment, allowing them to identify obstacle distributions and perform path planning even in unknown scenarios. To further enhance the adaptability and path planning performance of robots in complex environments, this paper develops a deep reinforcement learning framework based on the Soft Actor-Critic (SAC) algorithm. First, to address the limited adaptability of existing transfer learning methods, we propose an action-level fusion mechanism that dynamically integrates prior and current policies during inference, enabling more flexible knowledge transfer. Second, a bio-inspired radar perception optimization method is introduced, which mimics the biological mechanism of focusing on key regions while ignoring redundant information, thereby enhancing the expressiveness of sensory inputs. Finally, a reward function based on ineffective behavior recognition is designed to reduce unnecessary exploration during training. The proposed method is validated in both the Gazebo simulation environment and real-world scenarios. Experimental results demonstrate that the approach achieves faster convergence and superior obstacle avoidance performance in path planning tasks, exhibiting strong transferability and generalization across various obstacle configurations.
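The abstract's first contribution is an action-level fusion mechanism that dynamically blends the prior (transferred) policy with the current policy at inference time. The paper does not specify the exact blending rule here, so the sketch below assumes a simple convex combination of the two policies' actions with a trust weight; the function name `fused_action` and the externally supplied `weight` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fused_action(prior_action, current_action, weight):
    """Action-level fusion sketch: convex blend of the action proposed by a
    prior (pretrained) policy and the action proposed by the policy being
    trained. `weight` in [0, 1] is the trust placed in the prior policy;
    the paper computes it dynamically during inference, so it is left as
    an input here (assumption)."""
    weight = float(np.clip(weight, 0.0, 1.0))
    prior_action = np.asarray(prior_action, dtype=float)
    current_action = np.asarray(current_action, dtype=float)
    return weight * prior_action + (1.0 - weight) * current_action
```

With `weight = 1` the robot follows the transferred policy exclusively; with `weight = 0` it follows only the freshly trained SAC policy, so knowledge transfer degrades gracefully as the new policy improves.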
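The second contribution, bio-inspired radar perception optimization, focuses on key regions of the scan while discarding redundant beams. One plausible reading of that mechanism, sketched below under stated assumptions, is sector-wise compression of a dense range scan that keeps only the nearest obstacle per sector; the sector count and the min-pooling choice are illustrative, not taken from the paper.

```python
import numpy as np

def compress_scan(ranges, n_sectors=24):
    """Sketch of bio-inspired scan compression: split a dense radar/lidar
    scan into angular sectors and keep only the minimum range in each,
    so the policy input emphasizes the closest obstacle in every
    direction and drops redundant beams. `n_sectors=24` is an
    illustrative choice (assumption)."""
    ranges = np.asarray(ranges, dtype=float)
    sectors = np.array_split(ranges, n_sectors)
    return np.array([sector.min() for sector in sectors])
```

Reducing a 360-beam scan to a few dozen sector minima shrinks the state space seen by the SAC networks, which is consistent with the faster convergence the abstract reports.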