Path Planning Research of a UAV Base Station Searching for Disaster Victims' Location Information Based on Deep Reinforcement Learning.

Authors

Zhao Jinduo, Gan Zhigao, Liang Jiakai, Wang Chao, Yue Keqiang, Li Wenjun, Li Yilin, Li Ruixue

Affiliation

Zhejiang Integrated Circuits and Intelligent Hardware Collaborative Innovation Center, Hangzhou Dianzi University, Hangzhou 310018, China.

Publication information

Entropy (Basel). 2022 Dec 2;24(12):1767. doi: 10.3390/e24121767.

Abstract

To address path planning for unmanned aerial vehicle (UAV) base stations performing search tasks, this paper proposes a Double DQN-state splitting Q network (DDQN-SSQN) algorithm. Built on the deep reinforcement learning algorithm Double DQN (DDQN), it combines state splitting with the optimal state to plan the UAV's optimal path. The method stores multidimensional state information by category and uses targeted training to obtain optimal path information. It also uses the received signal strength indicator (RSSI) to influence the reward the agent receives, which reduces the difficulty of the UAV's decisions. To simulate realistic UAV operating scenarios, the paper builds a mission system model on the OpenAI Gym simulation platform. The simulation results show that the proposed scheme plans the optimal path faster than the traditional algorithms it is compared with, and that it is more stable and converges faster.
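
The abstract does not give the reward formula or the network details, so the following is only a minimal sketch of two ingredients it names: a reward influenced by RSSI and the standard Double DQN target. The log-distance path-loss model, the function names, and every parameter value (`tx_power_dbm`, `path_loss_exponent`, `step_cost`, `scale`, `gamma`) are assumptions made for illustration and are not taken from the paper. The state-splitting part of DDQN-SSQN is not described in enough detail here to sketch.

```python
import math


def rssi_dbm(distance_m, tx_power_dbm=-40.0, path_loss_exponent=2.0):
    """Assumed log-distance path-loss model (not from the paper).

    tx_power_dbm is the RSSI expected at the 1 m reference distance.
    """
    d = max(distance_m, 1.0)  # clamp to the reference distance
    return tx_power_dbm - 10.0 * path_loss_exponent * math.log10(d)


def shaped_reward(prev_rssi, new_rssi, step_cost=0.1, scale=1.0):
    """Hypothetical reward: RSSI gain of the last move minus a per-step cost."""
    return scale * (new_rssi - prev_rssi) - step_cost


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Standard Double DQN target for one transition.

    The online network picks the next action; the target network scores it.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # The UAV moves from 10 m to 1 m from the (hypothetical) victim device.
    r = shaped_reward(rssi_dbm(10.0), rssi_dbm(1.0))
    y = double_dqn_target(r, [0.2, 0.9, 0.1], [5.0, 2.0, 7.0], done=False)
    print(r, y)
```

In this sketch the online network's preferred action (index 1) is scored by the target network, so the target uses 2.0 and not the target network's own maximum of 7.0. That decoupling is what distinguishes Double DQN from plain DQN and is meant to reduce overestimated Q-values.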

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/8895885d4230/entropy-24-01767-g001.jpg
