Research on UCAV Maneuvering Decision Method Based on Heuristic Reinforcement Learning.

Affiliations

Air Force Engineering University, Xi'an, China.

Southeast University, Nanjing, China.

Publication information

Comput Intell Neurosci. 2022 Mar 3;2022:1477078. doi: 10.1155/2022/1477078. eCollection 2022.

DOI:10.1155/2022/1477078
PMID:35281202
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8913150/
Abstract

With the rapid development of unmanned combat aerial vehicle (UCAV) technologies, UCAVs are playing an increasingly important role in military operations, and UCAVs that complete air combat tasks independently to gain air superiority have become an inevitable trend in future air combat. This paper studies the UCAV maneuver decision problem in a continuous action space using a deep reinforcement learning strategy optimization method. A UCAV platform model with a continuous action space is established. To address the insufficient exploration ability of the Ornstein-Uhlenbeck (OU) exploration strategy in the deep deterministic policy gradient (DDPG) algorithm, a heuristic DDPG algorithm is proposed by introducing a heuristic exploration strategy, and a UCAV air combat maneuver decision method based on this heuristic DDPG algorithm is then proposed. The superior performance of the algorithm is verified by comparison with other algorithms in a test environment, and the effectiveness of the decision method is verified by simulating air combat tasks of different difficulty and attack modes.
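
For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of the standard Ornstein-Uhlenbeck exploration noise commonly paired with DDPG. It is not the authors' code, and it does not implement their heuristic exploration strategy, which the abstract does not describe. The class name `OUNoise`, the helper `explore`, the parameter values (`theta`, `sigma`, `dt`), and the action bounds are all illustrative assumptions.

```python
import math
import random


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (hypothetical helper, not from the paper)."""

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        self.dim = dim        # number of action dimensions (assumed)
        self.mu = mu          # long-run mean the noise reverts to
        self.theta = theta    # mean-reversion rate
        self.sigma = sigma    # noise scale
        self.dt = dt          # time step of the discretization
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its mean, typically at the start of an episode.
        self.state = [self.mu] * self.dim

    def sample(self):
        # Euler-Maruyama step: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def explore(action, noise, low=-1.0, high=1.0):
    """Add OU noise to a deterministic action and clip to assumed action bounds."""
    return [min(high, max(low, a + n)) for a, n in zip(action, noise.sample())]


if __name__ == "__main__":
    noise = OUNoise(dim=3, seed=0)
    # A placeholder deterministic action; a real actor network would produce this.
    print(explore([0.0, 0.5, 1.0], noise))
```

Because successive samples are temporally correlated and revert to `mu`, the perturbation stays close to the current policy output. That locality is one plausible reading of the "insufficient exploration ability" the abstract mentions, though the abstract itself does not spell out the cause.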
