Online discrete-time LQR controller design with integral action for bulk Bucket Wheel Reclaimer operational processes via Action-Dependent Heuristic Dynamic Programming.

Authors

de Moura José Pinheiro, Rego Patrícia Helena Moraes, da Fonseca Neto João Viana

Affiliation

UEMA, Brazil.

Publication information

ISA Trans. 2019 Jul;90:294-310. doi: 10.1016/j.isatra.2019.01.010. Epub 2019 Jan 30.

DOI:10.1016/j.isatra.2019.01.010

PMID:30732992

Abstract

This paper presents a novel approach to the online design of optimal control systems for the bulk reclaiming process of a bucket wheel reclaimer (BWR). The approach is based on reinforcement learning, specifically Action-Dependent Heuristic Dynamic Programming (ADHDP), which learns online, in real time, the optimal Discrete Linear Quadratic Regulator (DLQR) control solution with integral action. Because the stacks in the storage yard are geometrically irregular and the physical and chemical characteristics of the stacked material vary, controlling the flow of bulk solids with a bucket wheel reclaimer requires methods suited to highly imprecise process variables and an uncertain environment. Bulk solids are reclaimed by dividing the stack into layers, each approximately 4 m high, and dividing each layer into workbenches up to 12 m long. Reclaiming one workbench takes several translation steps (penetrations into the stack), with the translation step varying from 0 to 1 m. To maintain the desired ore flow throughout the process, the BWR lance speed must be adjusted periodically. The main advantages of the proposed control method are that the decision rule is fully independent of the plant model and that the gains of the resulting controller are self-adjusting. The control system was designed so that the ADHDP-based DLQR controller with integral action acts on the plant in real time, using only the input and output signals and the states measured along the system trajectory.
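The idea of learning a DLQR gain with integral action from measured data alone can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration and is not taken from the paper: the first-order plant (`a`, `b`), the cost weights `Qz` and `R`, and the use of batch least squares over one recorded trajectory in place of whatever online estimator the authors use. The sketch runs a Q-function value iteration in the ADHDP style, where the learner sees only states, inputs, and next states, and then compares the learned gain with a model-based Riccati solution.

```python
import numpy as np

# Hypothetical first-order flow model (NOT from the paper): x[k+1] = a*x[k] + b*u[k], y = x.
a, b = 0.9, 0.1
# Augmented state z = [x, v]; v integrates the tracking error: v[k+1] = v[k] + (r - y[k]).
A = np.array([[a, 0.0], [-1.0, 1.0]])
B = np.array([[b], [0.0]])
Qz, R = np.eye(2), np.eye(1)            # assumed quadratic cost weights
nz, nu = 2, 1
n = nz + nu
iu, ju = np.triu_indices(n)

def features(w):
    """Quadratic monomials so that features(w) @ h equals w.T @ H @ w for symmetric H."""
    return np.where(iu == ju, 1.0, 2.0) * np.outer(w, w)[iu, ju]

def unpack(h):
    """Rebuild the symmetric kernel H from its upper-triangular entries."""
    H = np.zeros((n, n))
    H[iu, ju] = h
    H[ju, iu] = h
    return H

def gain(H):
    """Greedy policy u = -K z from the Q-function kernel: K = H_uu^{-1} H_uz."""
    return np.linalg.solve(H[nz:, nz:], H[nz:, :nz])

# Record one exploratory trajectory (reference r = 0 while learning).
# A and B only simulate the plant here; the learner never reads them.
rng = np.random.default_rng(0)
z = np.zeros(nz)
samples = []
for _ in range(200):
    u = rng.normal(size=nu)             # exploration input
    z_next = A @ z + B @ u
    samples.append((z, u, z_next))
    z = z_next

Phi = np.array([features(np.concatenate([z, u])) for z, u, _ in samples])
cost = np.array([z @ Qz @ z + u @ R @ u for z, u, _ in samples])

# Value iteration on the Q-function: Q_{j+1}(z, u) = cost(z, u) + Q_j(z', -K_j z').
H = np.block([[Qz, np.zeros((nz, nu))], [np.zeros((nu, nz)), R]])
for _ in range(300):
    K = gain(H)
    nxt = np.array([features(np.concatenate([zn, -K @ zn])) for _, _, zn in samples])
    target = cost + nxt @ H[iu, ju]
    h, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    H = unpack(h)
K_learned = gain(H)

# Model-based reference: iterate the discrete Riccati recursion using A and B.
P = np.zeros((nz, nz))
for _ in range(2000):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Qz + A.T @ P @ (A - B @ G)
K_riccati = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Closed-loop setpoint tracking with the learned gain (r is an assumed flow setpoint).
r, x, v = 1.0, 0.0, 0.0
for _ in range(300):
    u = (-K_learned @ np.array([x, v])).item()
    x, v = a * x + b * u, v + (r - x)

print("learned gain:", K_learned)
print("Riccati gain:", K_riccati)
print("final output:", x)
```

The integrator state `v` is what removes steady-state error: once the closed loop settles, `v` can only stay constant if the output equals the setpoint. A recursive estimator updated at every sample would be closer to a real-time implementation than the batch fit used here.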
