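Q-Learning-Based Multivariable Nonlinear Model Predictive Controller: Experimental Validation of Temperature Trajectory Tracking on a Batch Reactor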

Vegesna Abhiram Varma, Shamaiah Narayanarao Muralikrishna, Bhamidipati Kishore, Indiran Thirunavukkarasu

Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576 104, India.

Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576 104, India.

ACS Omega. 2025 Jun 26;10(26):28362-28371. doi: 10.1021/acsomega.5c03482. eCollection 2025 Jul 8.

This study introduces a Q-learning-based nonlinear model predictive control (QL-NMPC) framework for temperature control in batch reactors. A reinforcement learning agent is trained in simulation to learn optimal control strategies, with coolant flow rate and heater current as the manipulated inputs. The resulting policy, represented as a Q-table, is implemented in real time on a physical reactor setup using the NVIDIA Jetson Orin platform. The proposed QL-NMPC framework employs a value-iteration-based Q-learning algorithm, which enables model-free policy optimization without explicit policy evaluation steps. The framework demonstrates effective temperature tracking and highlights the potential of reinforcement learning for controlling nonlinear batch processes without relying on system identification.
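To make the idea of a Q-table policy over two manipulated inputs concrete, the following is a minimal sketch of tabular Q-learning on a toy thermal model. It is not the authors' implementation. The state discretization (`E_MAX`, `N_STATES`), the action grid `ACTIONS`, the dynamics in `step`, the reward, and all hyperparameters are assumptions chosen only for illustration; the abstract does not specify any of them.

```python
import random

# Hypothetical discretization: tracking error e = setpoint - temperature (K),
# binned into N_STATES bins over [-E_MAX, E_MAX].
E_MAX = 10.0
N_STATES = 21
# Hypothetical action grid: (coolant flow fraction, heater current fraction).
ACTIONS = [(c, h) for c in (0.0, 0.5, 1.0) for h in (0.0, 0.5, 1.0)]


def to_state(error):
    """Map a continuous tracking error to a bin index in [0, N_STATES - 1]."""
    clipped = max(-E_MAX, min(E_MAX, error))
    return round((clipped + E_MAX) / (2 * E_MAX) * (N_STATES - 1))


def step(error, action):
    """Toy thermal response (an assumption, not the reactor model).

    Heater current raises the temperature, coolant flow lowers it, so a
    positive net temperature change reduces a positive tracking error.
    """
    coolant, heater = action
    d_temp = 2.0 * heater - 2.0 * coolant
    new_error = max(-E_MAX, min(E_MAX, error - d_temp))
    reward = -abs(new_error)  # penalize distance from the setpoint
    return new_error, reward


def train(episodes=3000, steps=30, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning; returns Q as a list of rows."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        error = rng.uniform(-E_MAX, E_MAX)
        for _ in range(steps):
            s = to_state(error)
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
            error, r = step(error, ACTIONS[a])
            s2 = to_state(error)
            # Q-learning update: bootstrap from the best next-state value.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
    return q


def greedy_action(q, error):
    """Table lookup of the kind a real-time loop could run at each sample."""
    s = to_state(error)
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]


if __name__ == "__main__":
    q_table = train()
    for e in (8.0, 0.0, -8.0):
        print(e, greedy_action(q_table, e))
```

Once training is finished, control reduces to the lookup in `greedy_action`, which is why a Q-table policy is cheap enough to run on an embedded platform. A real application would need a state that captures more than the instantaneous error (for example the reference trajectory and reaction exotherm), which this sketch deliberately omits.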
