
Path Planning Research of a UAV Base Station Searching for Disaster Victims' Location Information Based on Deep Reinforcement Learning

Authors

Zhao Jinduo, Gan Zhigao, Liang Jiakai, Wang Chao, Yue Keqiang, Li Wenjun, Li Yilin, Li Ruixue

Affiliation

Zhejiang Integrated Circuits and Intelligent Hardware Collaborative Innovation Center, Hangzhou Dianzi University, Hangzhou 310018, China.

Publication

Entropy (Basel). 2022 Dec 2;24(12):1767. doi: 10.3390/e24121767.

DOI: 10.3390/e24121767
PMID: 36554172
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9778616/
Abstract

To address the path planning problem of unmanned aerial vehicle (UAV) base stations performing search tasks, this paper proposes a Double DQN-state splitting Q network (DDQN-SSQN) algorithm. Built on the deep reinforcement learning DDQN algorithm, it combines state splitting and optimal state to plan the UAV's optimal path. The method stores multidimensional state information by category and uses targeted training to obtain optimal path information. It also uses the received signal strength indicator (RSSI) to influence the reward received by the agent, which reduces the decision difficulty of the UAV. To simulate the scenarios UAVs face in real work, the paper uses the OpenAI Gym simulation platform to construct a mission system model. The simulation results show that the proposed scheme plans the optimal path faster than other traditional algorithmic schemes and has a greater advantage in algorithm stability and convergence speed.
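
The abstract names two ingredients that a short sketch can make concrete: the Double DQN target and a reward influenced by RSSI. The Python below is an illustration under stated assumptions, not the paper's DDQN-SSQN implementation. The log-distance path-loss model, the function names (`rssi_dbm`, `shaped_reward`, `double_dqn_targets`), and every constant are hypothetical choices made for this example. The state-splitting part of the method is not shown, because the abstract does not specify it.

```python
import numpy as np


def rssi_dbm(distance_m, ref_rssi_dbm=-40.0, path_loss_exp=2.0):
    """Assumed log-distance path-loss model (not taken from the paper).

    ref_rssi_dbm is the RSSI at the 1 m reference distance.
    """
    d = max(distance_m, 1.0)  # clamp to the reference distance
    return ref_rssi_dbm - 10.0 * path_loss_exp * np.log10(d)


def shaped_reward(prev_dist_m, new_dist_m, scale=0.1, step_cost=0.1):
    """Hypothetical reward: RSSI gain over the step, minus a fixed step cost."""
    gain_db = rssi_dbm(new_dist_m) - rssi_dbm(prev_dist_m)
    return scale * gain_db - step_cost


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN target.

    The online network selects the next action; the target network
    evaluates that action. Terminal transitions get no bootstrap term.
    """
    best_actions = np.argmax(q_online_next, axis=1)
    evaluated = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * evaluated


# Two toy transitions: one moves toward the signal source, one moves away.
rewards = np.array([shaped_reward(100.0, 10.0), shaped_reward(10.0, 100.0)])
dones = np.array([0.0, 1.0])  # the second transition ends its episode
q_online_next = np.array([[1.0, 2.0], [3.0, 0.5]])  # made-up Q-values
q_target_next = np.array([[0.5, 0.2], [0.1, 0.9]])  # made-up Q-values
targets = double_dqn_targets(rewards, dones, q_online_next, q_target_next)
print(targets)
```

Decoupling action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, which takes the maximum over the target network's own estimates and tends to overestimate Q-values.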

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/8895885d4230/entropy-24-01767-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/9a81cc0e7a89/entropy-24-01767-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/0aed7fa5a6a3/entropy-24-01767-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/77a9817da0b5/entropy-24-01767-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/8a3f9a46e816/entropy-24-01767-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/b1785248e7ad/entropy-24-01767-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/e111aaf4a145/entropy-24-01767-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/9a4b8b1741e9/entropy-24-01767-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/22eb4dc4eaac/entropy-24-01767-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/4387241e2e9b/entropy-24-01767-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/446025eca50e/entropy-24-01767-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ab3/9778616/faec8a970862/entropy-24-01767-g012.jpg

Similar articles

1. Path Planning Research of a UAV Base Station Searching for Disaster Victims' Location Information Based on Deep Reinforcement Learning. Entropy (Basel). 2022 Dec 2;24(12):1767. doi: 10.3390/e24121767.
2. Proactive Handover Decision for UAVs with Deep Reinforcement Learning. Sensors (Basel). 2022 Feb 5;22(3):1200. doi: 10.3390/s22031200.
3. Multi-UAV Path Planning in GPS and Communication Denial Environment. Sensors (Basel). 2023 Mar 10;23(6):2997. doi: 10.3390/s23062997.
4. Trajectory optimization of UAV-IRS assisted 6G THz network using deep reinforcement learning approach. Sci Rep. 2024 Aug 9;14(1):18501. doi: 10.1038/s41598-024-68459-8.
5. Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments. Front Neurorobot. 2024 Jan 22;17:1302898. doi: 10.3389/fnbot.2023.1302898. eCollection 2023.
6. Real-time route planning of unmanned aerial vehicles based on improved soft actor-critic algorithm. Front Neurorobot. 2022 Dec 5;16:1025817. doi: 10.3389/fnbot.2022.1025817. eCollection 2022.
7. UAV Path Planning Algorithm Based on Improved Harris Hawks Optimization. Sensors (Basel). 2022 Jul 13;22(14):5232. doi: 10.3390/s22145232.
8. Deep Reinforcement Learning for UAV Trajectory Design Considering Mobile Ground Users. Sensors (Basel). 2021 Dec 9;21(24):8239. doi: 10.3390/s21248239.
9. A UAV Maneuver Decision-Making Algorithm for Autonomous Airdrop Based on Deep Reinforcement Learning. Sensors (Basel). 2021 Mar 23;21(6):2233. doi: 10.3390/s21062233.
10. Enhancing Stability and Performance in Mobile Robot Path Planning with PMR-Dueling DQN Algorithm. Sensors (Basel). 2024 Feb 27;24(5):1523. doi: 10.3390/s24051523.

Cited by

1. Path planning of mobile robot based on improved double deep Q-network algorithm. Front Neurorobot. 2025 Feb 13;19:1512953. doi: 10.3389/fnbot.2025.1512953. eCollection 2025.