Lin Yunhan, Zhang Zhijie, Tan Yijian, Fu Hao, Min Huasong
School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430081, China.
Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology, Wuhan, 430081, China.
Sci Rep. 2025 May 26;15(1):18331. doi: 10.1038/s41598-025-02244-z.
To address the challenges of sample utilization efficiency and temporal dependency management, this paper proposes an efficient path planning method for mobile robots in dynamic environments based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm. The proposed method, named PL-TD3, integrates prioritized experience replay (PER) and long short-term memory (LSTM) neural networks, which improve sample efficiency and the ability to handle time-series data, respectively. To verify the effectiveness of the proposed method, simulation and practical experiments were designed and conducted. The simulation experiments included both static and dynamic obstacles in the test environment, together with experiments assessing generalization capability; the algorithm demonstrated superior performance in both execution time and path efficiency. The practical experiments, conducted under the same assumptions as the simulation tests, further confirmed that PL-TD3 improves the effectiveness and robustness of path planning for mobile robots in dynamic environments.
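For readers unfamiliar with the two components the abstract names, the following is a minimal sketch of how a prioritized experience replay buffer and an LSTM-based actor are commonly combined in a TD3-style agent. It is an illustrative assumption, not the authors' PL-TD3 implementation: all class names, hyperparameters (e.g. `alpha`, `beta`, `hidden`), and network sizes are placeholders chosen for clarity.

```python
# Sketch of proportional PER and an LSTM actor for a TD3-style agent.
# Names and hyperparameters are illustrative assumptions, not the paper's code.
import numpy as np
import torch
import torch.nn as nn


class PrioritizedReplayBuffer:
    """Proportional PER: transition i is sampled with probability
    p_i^alpha / sum_j p_j^alpha and corrected by importance-sampling weights."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float32)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[: len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights offset the bias from non-uniform
        # sampling; normalized by the maximum weight for stability.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idx]
        return batch, idx, weights.astype(np.float32)

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priority = |TD error| + eps, so no transition becomes unsampleable.
        self.priorities[idx] = np.abs(td_errors) + eps


class LSTMActor(nn.Module):
    """Deterministic policy over a short observation sequence; the recurrent
    state captures temporal dependencies such as moving obstacles."""

    def __init__(self, obs_dim, act_dim, hidden=128, max_action=1.0):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, obs_seq):
        # obs_seq: (batch, seq_len, obs_dim); use the final hidden state.
        _, (h, _) = self.lstm(obs_seq)
        return self.max_action * self.head(h[-1])


if __name__ == "__main__":
    actor = LSTMActor(obs_dim=24, act_dim=2)
    buffer = PrioritizedReplayBuffer(capacity=10000)
    obs_seq = torch.randn(4, 8, 24)   # 4 observation sequences of 8 steps
    print(actor(obs_seq).shape)       # -> torch.Size([4, 2])
```

In such a setup the critics' absolute TD errors from each sampled batch would be fed back through `update_priorities`, while the actor and critics consume stacked observation windows rather than single states; the paper's own network structure and update schedule may differ.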