

Hierarchical Reinforcement Learning Framework in Geographic Coordination for Air Combat Tactical Pursuit

Authors

Chen Ruihai, Li Hao, Yan Guanwei, Peng Haojie, Zhang Qian

Affiliations

School of Aeronautics, Northwestern Polytechnical University, Xi'an 710072, China.

Chengdu Aircraft Design and Research Institute, Chengdu 610041, China.

Publication

Entropy (Basel). 2023 Oct 1;25(10):1409. doi: 10.3390/e25101409.

DOI: 10.3390/e25101409
PMID: 37895530
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10606649/
Abstract

This paper proposes an air combat training framework based on hierarchical reinforcement learning to address the problem of non-convergence in training due to the curse of dimensionality caused by the large state space during air combat tactical pursuit. Using hierarchical reinforcement learning, three-dimensional problems can be transformed into two-dimensional problems, improving training performance compared to other baselines. To further improve the overall learning performance, a meta-learning-based algorithm is established, and the corresponding reward function is designed to further improve the performance of the agent in the air combat tactical chase scenario. The results show that the proposed framework can achieve better performance than the baseline approach.

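The abstract's central idea is that a hierarchical decomposition can turn an intractable 3-D pursuit state space into lower-dimensional sub-problems that train reliably. The toy sketch below illustrates only that decomposition principle: a 3-D grid pursuit is split into a 2-D horizontal learner and a 1-D vertical learner, each a tabular Q-learner with a shaped closing-distance reward. Everything here (grid world, tabular Q-learning, the reward shaping, all names) is our assumption for illustration, not the authors' actual deep hierarchical RL and meta-learning framework:

```python
import random

class QLearner:
    """Minimal tabular Q-learning agent (illustrative only)."""
    def __init__(self, actions, alpha=0.5, gamma=0.9, eps=0.1):
        self.actions = actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.q = {}  # (state, action) -> estimated value

    def greedy(self, state):
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def act(self, state):
        if random.random() < self.eps:
            return random.choice(self.actions)
        return self.greedy(state)

    def update(self, s, a, r, s2):
        best = max(self.q.get((s2, a2), 0.0) for a2 in self.actions)
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (r + self.gamma * best - old)

def train(episodes=2000, size=5, steps=30):
    """Pursue a fixed target in a size^3 grid by splitting the 3-D task
    into a 2-D horizontal sub-problem and a 1-D vertical sub-problem."""
    planar = QLearner([(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)])
    vertical = QLearner([-1, 0, 1])
    target = (size - 1, size - 1, size - 1)
    for _ in range(episodes):
        x = y = z = 0  # pursuer start position
        for _ in range(steps):
            sp = (target[0] - x, target[1] - y)  # relative planar state
            sv = target[2] - z                   # relative vertical state
            ap, av = planar.act(sp), vertical.act(sv)
            x, y, z = x + ap[0], y + ap[1], z + av
            sp2 = (target[0] - x, target[1] - y)
            sv2 = target[2] - z
            # Shaped reward: negative remaining distance in each sub-space,
            # loosely analogous to the closing-distance terms the paper designs.
            planar.update(sp, ap, -(abs(sp2[0]) + abs(sp2[1])), sp2)
            vertical.update(sv, av, -abs(sv2), sv2)
            if sp2 == (0, 0) and sv2 == 0:
                break
    return planar, vertical

random.seed(0)
planar, vertical = train()
# After training, each greedy sub-policy should close distance on its own axes.
print(planar.greedy((4, 4)), vertical.greedy(4))
```

Each sub-learner sees only its own low-dimensional relative state, so the joint 3-D table never has to be enumerated — the same dimensionality argument the abstract makes, in miniature.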

Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/2c7cd0a54680/entropy-25-01409-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/b12accaec7d1/entropy-25-01409-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/d71d17e5cd4f/entropy-25-01409-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/1a162e6606ef/entropy-25-01409-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/1d50d0232d1a/entropy-25-01409-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/937a9d3e386b/entropy-25-01409-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/2a0384444b55/entropy-25-01409-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/3f1b3d8f24fd/entropy-25-01409-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/ea1b7bd86ee9/entropy-25-01409-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4c6/10606649/ee53500d1ebb/entropy-25-01409-g012.jpg

Similar Articles

1. Hierarchical Reinforcement Learning Framework in Geographic Coordination for Air Combat Tactical Pursuit. Entropy (Basel). 2023 Oct 1;25(10):1409. doi: 10.3390/e25101409.
2. Enhancing multi-UAV air combat decision making via hierarchical reinforcement learning. Sci Rep. 2024 Feb 23;14(1):4458. doi: 10.1038/s41598-024-54938-5.
3. A reward optimization method based on action subrewards in hierarchical reinforcement learning. ScientificWorldJournal. 2014 Jan 28;2014:120760. doi: 10.1155/2014/120760. eCollection 2014.
4. An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning. Entropy (Basel). 2021 Oct 29;23(11):1433. doi: 10.3390/e23111433.
5. Improved Robot Path Planning Method Based on Deep Reinforcement Learning. Sensors (Basel). 2023 Jun 15;23(12):5622. doi: 10.3390/s23125622.
6. A reinforcement learning algorithm acquires demonstration from the training agent by dividing the task space. Neural Netw. 2023 Jul;164:419-427. doi: 10.1016/j.neunet.2023.04.042. Epub 2023 May 5.
7. Intelligent air defense task assignment based on hierarchical reinforcement learning. Front Neurorobot. 2022 Dec 1;16:1072887. doi: 10.3389/fnbot.2022.1072887. eCollection 2022.
8. PaCAR: COVID-19 Pandemic Control Decision Making via Large-Scale Agent-Based Modeling and Deep Reinforcement Learning. Med Decis Making. 2022 Nov;42(8):1064-1077. doi: 10.1177/0272989X221107902. Epub 2022 Jul 1.
9. Boosting Reinforcement Learning via Hierarchical Game Playing With State Relay. IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):7077-7089. doi: 10.1109/TNNLS.2024.3386717. Epub 2025 Apr 4.
10. Intelligent multiagent coordination based on reinforcement hierarchical neuro-fuzzy models. Int J Neural Syst. 2014 Dec;24(8):1450031. doi: 10.1142/S0129065714500312. Epub 2014 Nov 18.

References Cited in This Article

1. Multi Pseudo Q-Learning-Based Deterministic Policy Gradient for Tracking Control of Autonomous Underwater Vehicles. IEEE Trans Neural Netw Learn Syst. 2019 Dec;30(12):3534-3546. doi: 10.1109/TNNLS.2018.2884797. Epub 2018 Dec 28.
2. Mastering the game of Go without human knowledge. Nature. 2017 Oct 18;550(7676):354-359. doi: 10.1038/nature24270.