Approximate neural optimal control with reinforcement learning for a torsional pendulum device.

Affiliation

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China.

Publication information

Neural Netw. 2019 Sep;117:1-7. doi: 10.1016/j.neunet.2019.04.026. Epub 2019 May 23.

DOI:10.1016/j.neunet.2019.04.026
PMID:31129489
Abstract

A torsional pendulum device containing hyperbolic tangent input nonlinearities can be formulated as a nonaffine system. Unlike basic affine systems, the optimal feedback control of complex nonaffine plants is difficult but quite important. In this paper, the approximate optimal control design of continuous-time nonaffine nonlinear systems is investigated with the help of reinforcement learning. For addressing the learning algorithm conveniently, an effective pre-compensation technique is adopted to perform proper system transformation. Then, the integral policy iteration strategy is incorporated to relieve the demand of system dynamics. Moreover, the actor-critic structure is implemented by virtue of neural network approximators. Finally, the experimental verification for the proposed torsional pendulum plant is conducted after a learning process of 20 iterations and the stability performance with basic robustness guarantee can be observed during two case studies.
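The integral policy iteration step mentioned in the abstract can be illustrated on a much simpler problem. The sketch below is not the authors' torsional pendulum controller: it assumes a scalar linear plant dx/dt = a*x + b*u with quadratic cost, a quadratic value function V(x) = p*x^2 in place of the neural-network critic, and a linear policy u = -k*x in place of the actor. All names and parameter values (`rollout`, `integral_policy_iteration`, `a`, `b`, `q`, `r`, `horizon`) are hypothetical. In this sketch the learner evaluates each policy from measured state and cost data over a time interval, so it never reads the drift parameter `a`; it still needs the input gain `b` for the improvement step.

```python
import math


def rollout(x0, k, a, b, q, r, horizon, dt=1e-3):
    """Simulate dx/dt = a*x + b*u under u = -k*x for `horizon` seconds.

    Returns the final state and the running cost integrated along the way.
    The plant parameter `a` is used only here, to generate "measured" data.
    """
    steps = int(round(horizon / dt))
    gain = q + r * k * k      # integrand is (q + r*k^2) * x^2 when u = -k*x
    closed_loop = a - b * k

    def f(s):
        return closed_loop * s

    x, cost = x0, 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for the closed-loop plant.
        s1 = f(x)
        s2 = f(x + 0.5 * dt * s1)
        s3 = f(x + 0.5 * dt * s2)
        s4 = f(x + dt * s3)
        x_next = x + dt * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
        # Trapezoidal rule for the running cost.
        cost += 0.5 * dt * gain * (x * x + x_next * x_next)
        x = x_next
    return x, cost


def integral_policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0,
                              k0=0.0, horizon=0.5, iterations=20):
    """Alternate policy evaluation and policy improvement.

    Evaluation solves the integral Bellman equation
        p * x(t)^2 = cost over [t, t + horizon] + p * x(t + horizon)^2
    for p using measured data only. Improvement sets k = b * p / r.
    The initial gain k0 must stabilize the plant (k0 = 0 works since a < 0).
    """
    k, p = k0, 0.0
    for _ in range(iterations):
        x0 = 1.0
        x1, cost = rollout(x0, k, a, b, q, r, horizon)
        p = cost / (x0 * x0 - x1 * x1)   # policy evaluation
        k = b * p / r                    # policy improvement
    return p, k


if __name__ == "__main__":
    p, k = integral_policy_iteration()
    # Closed-form Riccati solution for the default parameters, for comparison.
    p_star = math.sqrt(2.0) - 1.0
    print(f"learned p = {p:.6f}, Riccati p = {p_star:.6f}, gain k = {k:.6f}")
```

For the default values the learned `p` can be compared with the scalar Riccati solution sqrt(2) - 1. The 20 iterations mirror the count quoted in the abstract, although this toy problem settles after only a few.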

Similar articles

1
Approximate neural optimal control with reinforcement learning for a torsional pendulum device.
Neural Netw. 2019 Sep;117:1-7. doi: 10.1016/j.neunet.2019.04.026. Epub 2019 May 23.
2
Adaptive nearly optimal control for a class of continuous-time nonaffine nonlinear systems with inequality constraints.
ISA Trans. 2017 Jan;66:122-133. doi: 10.1016/j.isatra.2016.10.019. Epub 2016 Nov 9.
3
Self-learning robust optimal control for continuous-time nonlinear systems with mismatched disturbances.
Neural Netw. 2018 Mar;99:19-30. doi: 10.1016/j.neunet.2017.11.022. Epub 2017 Dec 13.
4
Adaptive Reinforcement Learning Control Based on Neural Approximation for Nonlinear Discrete-Time Systems With Unknown Nonaffine Dead-Zone Input.
IEEE Trans Neural Netw Learn Syst. 2019 Jan;30(1):295-305. doi: 10.1109/TNNLS.2018.2844165. Epub 2018 Jun 28.
5
Integral reinforcement learning based event-triggered control with input saturation.
Neural Netw. 2020 Nov;131:144-153. doi: 10.1016/j.neunet.2020.07.016. Epub 2020 Jul 30.
6
A policy iteration approach to online optimal control of continuous-time constrained-input systems.
ISA Trans. 2013 Sep;52(5):611-21. doi: 10.1016/j.isatra.2013.04.004. Epub 2013 May 24.
7
Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning.
Neural Netw. 2014 Jul;55:30-41. doi: 10.1016/j.neunet.2014.03.008. Epub 2014 Mar 28.
8
Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators.
IEEE Trans Syst Man Cybern B Cybern. 2012 Apr;42(2):377-90. doi: 10.1109/TSMCB.2011.2166384. Epub 2011 Sep 23.
9
Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks.
IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):994-1001. doi: 10.1109/TSMCB.2008.926607.
10
Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks.
IEEE Trans Neural Netw Learn Syst. 2013 Oct;24(10):1513-25. doi: 10.1109/TNNLS.2013.2276571.