Deep Reinforcement Learning with Local Attention for Single Agile Optical Satellite Scheduling Problem.

Authors

Liu Zheng, Xiong Wei, Han Chi, Yu Xiaolan

Affiliation

National Key Laboratory of Space Target Awareness, Space Engineering University, Beijing 101416, China.

Publication

Sensors (Basel). 2024 Oct 2;24(19):6396. doi: 10.3390/s24196396.

DOI: 10.3390/s24196396
PMID: 39409435
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11479382/
Abstract

This paper investigates the single agile optical satellite scheduling problem, which has received increasing attention due to the rapid growth in earth observation requirements. Owing to the complicated constraints and considerable solution space of this problem, the conventional exact methods and heuristic methods, which are sensitive to the problem scale, demand high computational expenses. Thus, an efficient approach is demanded to solve this problem, and this paper proposes a deep reinforcement learning algorithm with a local attention mechanism. A mathematical model is first established to describe this problem, which considers a series of complex constraints and takes the profit ratio of completed tasks as the optimization objective. Then, a neural network framework with an encoder-decoder structure is adopted to generate high-quality solutions, and a local attention mechanism is designed to improve the generation of solutions. In addition, an adaptive learning rate strategy is proposed to guide the actor-critic training algorithm to dynamically adjust the learning rate in the training process to enhance the training effectiveness of the proposed network. Finally, extensive experiments verify that the proposed algorithm outperforms the comparison algorithms in terms of solution quality, generalization performance, and computation efficiency.
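
The abstract does not say how the local attention mechanism defines "local", so the following is only a minimal sketch of one common reading: scaled dot-product attention in which the decoder may attend only to candidate tasks within a fixed index window around the current task. The function name `local_attention`, the `center` and `window` parameters, and the assumption that tasks are ordered (for example by visible time window) are all hypothetical and are not taken from the paper.

```python
import numpy as np


def local_attention(query, keys, values, center, window):
    """Scaled dot-product attention limited to a window of neighbours.

    Hypothetical illustration: only keys whose index lies within
    `window` positions of `center` receive non-zero attention weight.
    """
    n, d = keys.shape
    scores = keys @ query / np.sqrt(d)            # one score per candidate task
    in_window = np.abs(np.arange(n) - center) <= window
    scores = np.where(in_window, scores, -np.inf)  # mask out distant tasks
    scores = scores - scores[in_window].max()      # stabilise the softmax
    weights = np.exp(scores)                       # exp(-inf) evaluates to 0
    weights = weights / weights.sum()
    return weights @ values, weights


# Usage: 10 candidate tasks with 4-dimensional embeddings (random stand-ins).
rng = np.random.default_rng(0)
keys = rng.normal(size=(10, 4))
values = rng.normal(size=(10, 3))
query = rng.normal(size=4)
context, weights = local_attention(query, keys, values, center=5, window=2)
```

Masking before the softmax keeps the weights a valid probability distribution over the nearby tasks only, which is one way such a mechanism could narrow the decoder's choice at each step.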

Similar articles

1
Deep Reinforcement Learning with Local Attention for Single Agile Optical Satellite Scheduling Problem.
Sensors (Basel). 2024 Oct 2;24(19):6396. doi: 10.3390/s24196396.
2
Multi-Adaptive Strategies-Based Higher-Order Quantum Genetic Algorithm for Agile Remote Sensing Satellite Scheduling Problem.
Sensors (Basel). 2024 Jul 30;24(15):4938. doi: 10.3390/s24154938.
3
Task Offloading Decision-Making Algorithm for Vehicular Edge Computing: A Deep-Reinforcement-Learning-Based Approach.
Sensors (Basel). 2023 Sep 1;23(17):7595. doi: 10.3390/s23177595.
4
Intelligent Decision-Making of Scheduling for Dynamic Permutation Flowshop via Deep Reinforcement Learning.
Sensors (Basel). 2021 Feb 2;21(3):1019. doi: 10.3390/s21031019.
5
A deep reinforcement learning algorithm for the rectangular strip packing problem.
PLoS One. 2023 Mar 16;18(3):e0282598. doi: 10.1371/journal.pone.0282598. eCollection 2023.
6
An actor-critic framework based on deep reinforcement learning for addressing flexible job shop scheduling problems.
Math Biosci Eng. 2024 Jan;21(1):1445-1471. doi: 10.3934/mbe.2024062. Epub 2022 Dec 28.
7
Deep Reinforcement Learning Microgrid Optimization Strategy Considering Priority Flexible Demand Side.
Sensors (Basel). 2022 Mar 14;22(6):2256. doi: 10.3390/s22062256.
8
A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems.
PeerJ Comput Sci. 2024 Jun 28;10:e2161. doi: 10.7717/peerj-cs.2161. eCollection 2024.
9
A deep reinforcement learning algorithm framework for solving multi-objective traveling salesman problem based on feature transformation.
Neural Netw. 2024 Aug;176:106359. doi: 10.1016/j.neunet.2024.106359. Epub 2024 May 3.
10
Deep reinforcement learning algorithm for solving material emergency dispatching problem.
Math Biosci Eng. 2022 Aug 1;19(11):10864-10881. doi: 10.3934/mbe.2022508.

Cited by

1
Mission Sequence Model and Deep Reinforcement Learning-Based Replanning Method for Multi-Satellite Observation.
Sensors (Basel). 2025 Mar 10;25(6):1707. doi: 10.3390/s25061707.
