Continuous-time Q-learning for infinite-horizon discounted cost linear quadratic regulator problems.

Publication Information

IEEE Trans Cybern. 2015 Feb;45(2):165-76. doi: 10.1109/TCYB.2014.2322116. Epub 2014 May 29.

Abstract

This paper presents a Q-learning method for solving the discounted linear quadratic regulator (LQR) problem for continuous-time (CT) continuous-state systems. Most methods in the existing literature for solving the LQR problem for CT systems require partial or complete knowledge of the system dynamics. Q-learning is effective for unknown dynamical systems, but has generally been well understood only for discrete-time systems. The contribution of this paper is a Q-learning methodology for CT systems that solves the LQR problem without any knowledge of the system dynamics. A natural and rigorously justified parameterization of the Q-function is given in terms of the state, the control input, and its derivatives. This parameterization allows the implementation of an online Q-learning algorithm for CT systems. Simulation results supporting the theoretical development are also presented.
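To make the idea of model-free Q-learning for LQR concrete, the sketch below implements a standard discrete-time analogue (Bradtke-style Q-function policy iteration), not the paper's continuous-time algorithm: a quadratic Q-function Q(x, u) = [x; u]^T H [x; u] is fit from trajectory data by least squares on the Bellman equation, and the policy is improved from the learned H without ever using the system matrices. All numerical values (A, B, Qc, R, gamma, noise scale) are illustrative placeholders, not quantities from the paper.

```python
import numpy as np

# Discrete-time analogue of model-free Q-learning for the discounted LQR
# problem. Only illustrates the general idea; the paper's method is the
# continuous-time counterpart with a differently parameterized Q-function.

rng = np.random.default_rng(0)

# Unknown (to the learner) system and cost -- used only to generate data.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qc = np.eye(2)          # state cost
R = np.array([[1.0]])   # input cost
gamma = 0.9             # discount factor

n, m = B.shape
p = n + m

def quad_features(z):
    """Quadratic basis phi(z) with z^T H z = phi(z)^T theta (upper-triangular H)."""
    idx = np.triu_indices(p)
    w = np.where(idx[0] == idx[1], 1.0, 2.0)   # off-diagonal terms appear twice
    return w * np.outer(z, z)[idx]

def theta_to_H(theta):
    """Rebuild the symmetric Q-function matrix H from its upper-triangular entries."""
    H = np.zeros((p, p))
    H[np.triu_indices(p)] = theta
    return H + H.T - np.diag(np.diag(H))

K = np.zeros((m, n))    # initial stabilizing policy gain, u = -K x

for _ in range(10):
    # Collect a rollout under the current policy plus exploration noise.
    Phi, targets = [], []
    x = rng.standard_normal(n)
    for _ in range(400):
        u = -K @ x + 0.5 * rng.standard_normal(m)
        x_next = A @ x + B @ u
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])
        cost = x @ Qc @ x + u @ R @ u
        # Bellman equation, linear in theta: (phi(z) - gamma*phi(z'))^T theta = cost
        Phi.append(quad_features(z) - gamma * quad_features(z_next))
        targets.append(cost)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(targets), rcond=None)
    H = theta_to_H(theta)
    Hxu, Huu = H[:n, n:], H[n:, n:]
    K = np.linalg.solve(Huu, Hxu.T)    # policy improvement: u = -Huu^{-1} Hux x

print("learned gain K:\n", K)
```

According to the abstract, the paper's contribution is the continuous-time counterpart of this loop, with the Q-function parameterized in the state, the control input, and its derivatives rather than in a one-step successor state.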

