Robust reinforcement learning control using integral quadratic constraints for recurrent neural networks.

Authors

Anderson Charles W, Young Peter Michael, Buehner Michael R, Knight James N, Bush Keith A, Hittle Douglas C

Affiliation

Department of Computer Science, Colorado State University, Fort Collins, CO 80523-1873, USA.

Publication

IEEE Trans Neural Netw. 2007 Jul;18(4):993-1002. doi: 10.1109/TNN.2007.899520.

DOI:10.1109/TNN.2007.899520
PMID:17668656
Abstract

The applicability of machine learning techniques for feedback control systems is limited by a lack of stability guarantees. Robust control theory offers a framework for analyzing the stability of feedback control loops, but for the integral quadratic constraint (IQC) framework used here, all components are required to be represented as linear, time-invariant systems plus uncertainties with, for IQCs used here, bounded gain. In this paper, the stability of a control loop including a recurrent neural network (NN) is analyzed by replacing the nonlinear and time-varying components of the NN with IQCs on their gain. As a result, a range of the NN's weights is found within which stability is guaranteed. An algorithm is demonstrated for training the recurrent NN using reinforcement learning and guaranteeing stability while learning.
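The idea of keeping the weights inside a range where stability is guaranteed during learning can be illustrated with a much cruder test than the IQC analysis the abstract describes. The sketch below is an assumption-laden stand-in, not the paper's algorithm: it assumes a discrete-time recurrent network of the form x[t+1] = tanh(W x[t] + u), uses the fact that tanh has gain at most 1, and accepts a weight update only if the induced infinity norm of W stays below 1 (a small-gain style sufficient condition). The function names `inf_norm`, `is_certified_stable`, and `guarded_update` are hypothetical.

```python
def inf_norm(W):
    """Induced infinity norm of a matrix given as a list of rows: max absolute row sum."""
    return max(sum(abs(w) for w in row) for row in W)


def is_certified_stable(W, margin=1e-3):
    """Sufficient (not necessary) test, assumed for illustration only.

    tanh is 1-Lipschitz elementwise, so x -> tanh(W x + u) is a contraction
    in the infinity norm whenever ||W||_inf < 1.
    """
    return inf_norm(W) < 1.0 - margin


def guarded_update(W, delta, shrink=0.5, max_tries=20):
    """Apply W + scale * delta only if the result passes the stability test.

    The step is shrunk repeatedly; if no scaled step passes within max_tries,
    the current weights are returned unchanged.
    """
    scale = 1.0
    for _ in range(max_tries):
        candidate = [
            [w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)
        ]
        if is_certified_stable(candidate):
            return candidate
        scale *= shrink
    return [row[:] for row in W]


# Hypothetical example: a learning rule proposes a step that is far too large.
W = [[0.5, 0.0], [0.0, 0.5]]
delta = [[10.0, 0.0], [0.0, 10.0]]
W_new = guarded_update(W, delta)
```

In this sketch the learning rule that produces `delta` is left open; any reinforcement learning update could be passed through the guard. The norm bound is far more conservative than an IQC-based analysis of the full control loop, which is why it should be read only as an illustration of the "check before you update" pattern.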

Similar articles

1
Robust reinforcement learning control using integral quadratic constraints for recurrent neural networks.
IEEE Trans Neural Netw. 2007 Jul;18(4):993-1002. doi: 10.1109/TNN.2007.899520.
2
Kernel-based least squares policy iteration for reinforcement learning.
IEEE Trans Neural Netw. 2007 Jul;18(4):973-92. doi: 10.1109/TNN.2007.899161.
3
Reinforcement-learning-based output-feedback control of nonstrict nonlinear discrete-time systems with application to engine emission control.
IEEE Trans Syst Man Cybern B Cybern. 2009 Oct;39(5):1162-79. doi: 10.1109/TSMCB.2009.2013272. Epub 2009 Mar 24.
4
Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints.
IEEE Trans Syst Man Cybern B Cybern. 2007 Apr;37(2):425-36. doi: 10.1109/tsmcb.2006.883869.
5
Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks.
IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):994-1001. doi: 10.1109/TSMCB.2008.926607.
6
Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning.
Neural Netw. 2014 Jul;55:30-41. doi: 10.1016/j.neunet.2014.03.008. Epub 2014 Mar 28.
7
Discrete-time adaptive backstepping nonlinear control via high-order neural networks.
IEEE Trans Neural Netw. 2007 Jul;18(4):1185-95. doi: 10.1109/TNN.2007.899170.
8
Neural-network-based nonlinear adaptive dynamical decoupling control.
IEEE Trans Neural Netw. 2007 May;18(3):921-5. doi: 10.1109/TNN.2007.891588.
9
Robust Integral of Neural Network and Error Sign Control of MIMO Nonlinear Systems.
IEEE Trans Neural Netw Learn Syst. 2015 Dec;26(12):3278-86. doi: 10.1109/TNNLS.2015.2470175. Epub 2015 Sep 1.
10
Autonomous reinforcement learning with experience replay.
Neural Netw. 2013 May;41:156-67. doi: 10.1016/j.neunet.2012.11.007. Epub 2012 Nov 29.

Cited by

1
A Novel Approach to Condition Monitoring of the Cutting Process Using Recurrent Neural Networks.
Sensors (Basel). 2020 Aug 11;20(16):4493. doi: 10.3390/s20164493.