A Review of Safe Reinforcement Learning: Methods, Theories, and Applications.

Authors

Gu Shangding, Yang Long, Du Yali, Chen Guang, Walter Florian, Wang Jun, Knoll Alois

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11216-11235. doi: 10.1109/TPAMI.2024.3457538. Epub 2024 Nov 6.

Abstract

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making tasks. However, safety concerns arise when deploying RL in real-world applications such as autonomous driving and robotics, leading to a growing demand for safe RL algorithms. While safe control has a long history, the study of safe RL algorithms is still in its early stages. To establish a good foundation for future safe RL research, we review safe RL from the perspectives of methods, theories, and applications. First, we review the progress of safe RL along five dimensions and identify five crucial problems for deploying safe RL in real-world applications, coined "2H3W". Second, we analyze progress in algorithms and theory from the perspective of answering the "2H3W" problems. In particular, the sample complexity of safe RL algorithms is reviewed and discussed, followed by an introduction to the applications and benchmarks of safe RL algorithms. Finally, we open a discussion of the challenging problems in safe RL, hoping to inspire future research in this direction. To advance the study of safe RL algorithms, we release an open-source repository containing major safe RL algorithms at the link.
