A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems

Authors

Rafael Figueiredo Prudencio, Marcos R. O. A. Maximo, Esther Luna Colombini

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):10237-10257. doi: 10.1109/TNNLS.2023.3250269. Epub 2024 Aug 5.

DOI: 10.1109/TNNLS.2023.3250269
PMID: 37030754

Abstract

With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from pixel observations, sustaining conversations with humans, and controlling robotic agents. However, there is still a wide range of domains inaccessible to RL due to the high cost and danger of interacting with the environment. Offline RL is a paradigm that learns exclusively from static datasets of previously collected interactions, making it feasible to extract policies from large and diverse training datasets. Effective offline RL algorithms have a much wider range of applications than online RL, being particularly appealing for real-world applications, such as education, healthcare, and robotics. In this work, we contribute with a unifying taxonomy to classify offline RL methods. Furthermore, we provide a comprehensive review of the latest algorithmic breakthroughs in the field using a unified notation as well as a review of existing benchmarks' properties and shortcomings. Additionally, we provide a figure that summarizes the performance of each method and class of methods on different dataset properties, equipping researchers with the tools to decide which type of algorithm is best suited for the problem at hand and identify which classes of algorithms look the most promising. Finally, we provide our perspective on open problems and propose future research directions for this rapidly growing field.

Similar articles

1. A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems.
   IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):10237-10257. doi: 10.1109/TNNLS.2023.3250269. Epub 2024 Aug 5.
2. A Review of Safe Reinforcement Learning: Methods, Theories, and Applications.
   IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11216-11235. doi: 10.1109/TPAMI.2024.3457538. Epub 2024 Nov 6.
3. On Transforming Reinforcement Learning With Transformers: The Development Trajectory.
   IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8580-8599. doi: 10.1109/TPAMI.2024.3408271. Epub 2024 Nov 6.
4. Reinforcement Learning Algorithms and Applications in Healthcare and Robotics: A Comprehensive and Systematic Review.
   Sensors (Basel). 2024 Apr 11;24(8):2461. doi: 10.3390/s24082461.
5. A review of reinforcement learning based hyper-heuristics.
   PeerJ Comput Sci. 2024 Jun 28;10:e2141. doi: 10.7717/peerj-cs.2141. eCollection 2024.
6. Improving Offline Reinforcement Learning With In-Sample Advantage Regularization for Robot Manipulation.
   IEEE Trans Neural Netw Learn Syst. 2024 Sep 20;PP. doi: 10.1109/TNNLS.2024.3443102.
7. Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain.
   IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):8762-8782. doi: 10.1109/TNNLS.2023.3236361. Epub 2024 Jul 8.
8. Offline reinforcement learning for safer blood glucose control in people with type 1 diabetes.
   J Biomed Inform. 2023 Jun;142:104376. doi: 10.1016/j.jbi.2023.104376. Epub 2023 May 4.
9. Reinforcement Learning for Mobile Robotics Exploration: A Survey.
   IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):3796-3810. doi: 10.1109/TNNLS.2021.3124466. Epub 2023 Aug 4.
10. Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation.
    IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):2804-2818. doi: 10.1109/TPAMI.2023.3339515. Epub 2024 Apr 3.

Cited by

1. Offline reinforcement learning for learning to dispatch for job shop scheduling.
   Mach Learn. 2025;114(8):191. doi: 10.1007/s10994-025-06826-w. Epub 2025 Jul 15.
2. : A Reinforcement Learning Benchmark for Dynamic Treatment Regimes.
   Adv Neural Inf Process Syst. 2024;37:130536-130568.
3. Offline reinforcement learning combining generalized advantage estimation and modality decomposition interaction.
   Sci Rep. 2025 May 4;15(1):15601. doi: 10.1038/s41598-025-98572-1.
4. Data-driven energy management for electric vehicles using offline reinforcement learning.
   Nat Commun. 2025 Mar 22;16(1):2835. doi: 10.1038/s41467-025-58192-9.
5. Dynamic optimizers for complex industrial systems via direct data-driven synthesis.
   Commun Eng. 2025 Feb 17;4(1):25. doi: 10.1038/s44172-025-00368-8.
6. Offline prompt reinforcement learning method based on feature extraction.
   PeerJ Comput Sci. 2025 Jan 2;11:e2490. doi: 10.7717/peerj-cs.2490. eCollection 2025.
7. Individualized decision making in on-scene resuscitation time for out-of-hospital cardiac arrest using reinforcement learning.
   NPJ Digit Med. 2024 Oct 9;7(1):276. doi: 10.1038/s41746-024-01278-3.
8. Deep reinforcement learning for personalized treatment recommendation.
   Stat Med. 2022 Sep 10;41(20):4034-4056. doi: 10.1002/sim.9491. Epub 2022 Jun 18.