James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras
James Queeney is with Mitsubishi Electric Research Laboratories, Cambridge, MA 02139 USA. He performed the majority of this work while with the Division of Systems Engineering, Boston University, Boston, MA 02215 USA.
Ioannis Ch. Paschalidis and Christos G. Cassandras are with the Department of Electrical and Computer Engineering and the Division of Systems Engineering, Boston University, Boston, MA 02215 USA.
IEEE Trans Automat Contr. 2025 Feb;70(2):1236-1243. doi: 10.1109/tac.2024.3454011. Epub 2024 Sep 3.
We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a trade-off between two important deployment requirements for real-world control: (i) practical performance guarantees and (ii) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.