

A review of reinforcement learning based hyper-heuristics

Authors

Li Cuixia, Wei Xiang, Wang Jing, Wang Shuozhe, Zhang Shuyan

Affiliation

School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou, Henan, China.

Publication

PeerJ Comput Sci. 2024 Jun 28;10:e2141. doi: 10.7717/peerj-cs.2141. eCollection 2024.

DOI: 10.7717/peerj-cs.2141
PMID: 38983203
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11232579/
Abstract

Reinforcement learning based hyper-heuristics (RL-HH) are a popular trend in the field of optimization. RL-HH combines the global search ability of hyper-heuristics (HH) with the learning ability of reinforcement learning (RL). This synergy allows the agent to dynamically adjust its own strategy, leading to gradual optimization of the solution. Existing research has shown the effectiveness of RL-HH in solving complex real-world problems. However, a comprehensive introduction to and summary of the RL-HH field has been lacking. This article reviews existing RL-HHs and presents a general framework for them, categorizing the algorithms into two classes: value-based reinforcement learning hyper-heuristics and policy-based reinforcement learning hyper-heuristics. Typical algorithms in each category are summarized and described in detail. Finally, the shortcomings of existing research on RL-HH and future research directions are discussed.
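To make the value-based family described in the abstract concrete, the following is a minimal sketch of a single-state Q-learning hyper-heuristic. It is illustrative only, not an algorithm from the paper: the toy problem (sorting a permutation by minimizing inversions), the two low-level heuristics (`random_swap`, `random_insert`), and all parameter values are assumptions chosen for the example.

```python
import random

# Low-level heuristics the hyper-heuristic chooses between (illustrative).
def random_swap(sol):
    # Swap two randomly chosen positions.
    s = sol[:]
    i, j = random.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def random_insert(sol):
    # Remove one element and reinsert it at a random position.
    s = sol[:]
    x = s.pop(random.randrange(len(s)))
    s.insert(random.randrange(len(s) + 1), x)
    return s

def cost(sol):
    # Toy objective: number of out-of-order pairs (0 means sorted).
    n = len(sol)
    return sum(1 for i in range(n) for j in range(i + 1, n) if sol[i] > sol[j])

def q_learning_hh(start, heuristics, episodes=2000, alpha=0.1, gamma=0.9, eps=0.2):
    # Single-state Q-learning: q[a] estimates the long-run reward of
    # applying low-level heuristic a. The agent adjusts its selection
    # strategy as it observes which heuristics improve the solution.
    q = [0.0] * len(heuristics)
    cur, cur_c = start[:], cost(start)
    best, best_c = cur[:], cur_c
    for _ in range(episodes):
        # Epsilon-greedy selection of a low-level heuristic.
        if random.random() < eps:
            a = random.randrange(len(heuristics))
        else:
            a = max(range(len(heuristics)), key=lambda k: q[k])
        cand = heuristics[a](cur)
        cand_c = cost(cand)
        reward = cur_c - cand_c  # positive when the heuristic improved the solution
        q[a] += alpha * (reward + gamma * max(q) - q[a])
        if cand_c <= cur_c:  # accept non-worsening moves
            cur, cur_c = cand, cand_c
            if cur_c < best_c:
                best, best_c = cur[:], cur_c
    return best, best_c, q

if __name__ == "__main__":
    random.seed(42)
    start = list(range(8))
    random.shuffle(start)
    best, best_c, q = q_learning_hh(start, [random_swap, random_insert], episodes=5000)
    print(best, best_c)
```

The key design choice, common to value-based RL-HHs, is that the reward is the change in objective value, so the Q-table gradually concentrates selection probability on whichever low-level heuristic is paying off; a policy-based RL-HH would instead parameterize and directly optimize the selection policy.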

Figures (PMC11232579):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ab6/11232579/6c194ca99f86/peerj-cs-10-2141-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ab6/11232579/1f67d6147ba9/peerj-cs-10-2141-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ab6/11232579/37076025d10b/peerj-cs-10-2141-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ab6/11232579/96e34ae28b76/peerj-cs-10-2141-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ab6/11232579/77929f20b609/peerj-cs-10-2141-g005.jpg

Similar Articles

1. A review of reinforcement learning based hyper-heuristics. PeerJ Comput Sci. 2024 Jun 28;10:e2141. doi: 10.7717/peerj-cs.2141. eCollection 2024.
2. Simple Hyper-Heuristics Control the Neighbourhood Size of Randomised Local Search Optimally for LeadingOnes. Evol Comput. 2020 Fall;28(3):437-461. doi: 10.1162/evco_a_00258. Epub 2019 May 23.
3. Hyper-heuristics with low level parameter adaptation. Evol Comput. 2012 Summer;20(2):189-227. doi: 10.1162/EVCO_a_00063. Epub 2012 Feb 24.
4. A Case Study of Controlling Crossover in a Selection Hyper-heuristic Framework Using the Multidimensional Knapsack Problem. Evol Comput. 2016 Spring;24(1):113-41. doi: 10.1162/EVCO_a_00145. Epub 2015 Jan 30.
5. Neural Decoders Using Reinforcement Learning in Brain Machine Interfaces: A Technical Review. Front Syst Neurosci. 2022 Aug 26;16:836778. doi: 10.3389/fnsys.2022.836778. eCollection 2022.
6. A Review of Safe Reinforcement Learning: Methods, Theories, and Applications. IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11216-11235. doi: 10.1109/TPAMI.2024.3457538. Epub 2024 Nov 6.
7. On Transforming Reinforcement Learning With Transformers: The Development Trajectory. IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8580-8599. doi: 10.1109/TPAMI.2024.3408271. Epub 2024 Nov 6.
8. A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems. IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):10237-10257. doi: 10.1109/TNNLS.2023.3250269. Epub 2024 Aug 5.
9. Deep reinforcement learning in chemistry: A review. J Comput Chem. 2024 Aug 15;45(22):1886-1898. doi: 10.1002/jcc.27354. Epub 2024 May 2.
10. Heuristically-accelerated multiagent reinforcement learning. IEEE Trans Cybern. 2014 Feb;44(2):252-65. doi: 10.1109/TCYB.2013.2253094.

References Cited by This Article

1. Drilling Sequence Optimization Using Evolutionary Algorithms to Reduce Heat Accumulation for Femtosecond Laser Drilling with Multi-Spot Beam Profiles. Materials (Basel). 2023 Aug 23;16(17):5775. doi: 10.3390/ma16175775.
2. A Hyperheuristic With Q-Learning for the Multiobjective Energy-Efficient Distributed Blocking Flow Shop Scheduling Problem. IEEE Trans Cybern. 2023 May;53(5):3337-3350. doi: 10.1109/TCYB.2022.3192112. Epub 2023 Apr 21.
3. Q-Learning-Based Hyperheuristic Evolutionary Algorithm for Dynamic Task Allocation of Crowdsensing. IEEE Trans Cybern. 2023 Apr;53(4):2211-2224. doi: 10.1109/TCYB.2021.3112675. Epub 2023 Mar 16.
4. Vector-based navigation using grid-like representations in artificial agents. Nature. 2018 May;557(7705):429-433. doi: 10.1038/s41586-018-0102-6. Epub 2018 May 9.
5. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker. Science. 2017 May 5;356(6337):508-513. doi: 10.1126/science.aam6960. Epub 2017 Mar 2.
6. A dynamic multiarmed bandit-gene expression programming hyper-heuristic for combinatorial optimization problems. IEEE Trans Cybern. 2015 Feb;45(2):217-28. doi: 10.1109/TCYB.2014.2323936. Epub 2014 Jun 2.