Q‑Learning-Based Multivariate Nonlinear Model Predictive Controller: Experimental Validation on Batch Reactor for Temperature Trajectory Tracking.

Authors

Vegesna Abhiram Varma, Shamaiah Narayanarao Muralikrishna, Bhamidipati Kishore, Indiran Thirunavukkarasu

Affiliations

Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576 104, India.

Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576 104, India.

Publication information

ACS Omega. 2025 Jun 26;10(26):28362-28371. doi: 10.1021/acsomega.5c03482. eCollection 2025 Jul 8.

DOI:10.1021/acsomega.5c03482
PMID:40657105
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12242636/
Abstract

This study introduces a Q-learning-based nonlinear model predictive control (QL-NMPC) framework for temperature control in batch reactors. A reinforcement learning agent is trained in simulation to learn optimal control strategies using coolant flow rate and heater current as inputs. The resulting policy, represented as a Q-table, is implemented in real time on a physical reactor setup using the NVIDIA Jetson Orin platform. The proposed QL-NMPC framework employs a value iteration-based Q-learning algorithm, enabling model-free policy optimization without explicit policy evaluation steps, and demonstrates effective temperature tracking while highlighting the potential of reinforcement learning for controlling nonlinear batch processes without relying on system identification.
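
The abstract describes a tabular policy learned in simulation with two manipulated inputs. A minimal sketch of that idea follows, assuming a hypothetical one-degree-per-sample toy plant, binary heater and coolant levels, a fixed setpoint, and a squared-error reward; none of these come from the paper, and the reactor model, action grid, and tuning used by the authors are not given here. It applies the standard Q-learning update Q(s,a) ← Q(s,a) + α[r + γ·max Q(s′,·) − Q(s,a)], which bootstraps from the greedy value of the next state and so needs no separate policy evaluation step.

```python
import random

# Hypothetical toy plant (not the reactor model from the paper): the temperature
# moves by one degree per sample for each unit of net heating or cooling.
ACTIONS = [(heater, coolant) for heater in (0, 1) for coolant in (0, 1)]
MAX_ERR = 5                      # tracking error is limited to [-5, 5] degrees
N_STATES = 2 * MAX_ERR + 1       # one Q-table row per integer error value
SETPOINT = 50                    # assumed constant reference temperature


def plant_step(temp, action):
    """Apply (heater, coolant) for one sample and clamp to the modeled range."""
    heater, coolant = action
    temp = temp + heater - coolant
    return max(SETPOINT - MAX_ERR, min(SETPOINT + MAX_ERR, temp))


def state_of(temp):
    """Index of the Q-table row for the current tracking error."""
    return (SETPOINT - temp) + MAX_ERR


def best_action_index(q_row):
    """Index of the highest-valued action (first one wins on ties)."""
    return max(range(len(q_row)), key=lambda i: q_row[i])


def train(episodes=2000, steps=20, alpha=0.2, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        temp = SETPOINT + rng.randint(-MAX_ERR, MAX_ERR)
        for _ in range(steps):
            s = state_of(temp)
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))      # explore
            else:
                a = best_action_index(q[s])          # exploit
            temp = plant_step(temp, ACTIONS[a])
            error = SETPOINT - temp
            reward = -(error ** 2)                   # penalize tracking error
            s_next = state_of(temp)
            # Off-policy update: bootstrap from the best action in the next state.
            q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])
    return q


def greedy_action(q, temp):
    """Look up the (heater, coolant) pair the learned table prefers."""
    return ACTIONS[best_action_index(q[state_of(temp)])]


q_table = train()
for temp in (46, 50, 54):
    print(temp, greedy_action(q_table, temp))
```

Once trained, the table is only read at run time: each sample needs one row lookup and an argmax. That lookup-only structure is consistent with the real-time deployment on an embedded platform that the abstract reports, though the sketch does not reproduce it.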

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/ea8198000c03/ao5c03482_0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/25b73f6b842a/ao5c03482_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/4fdcaf9c471d/ao5c03482_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/a200a9aaf87e/ao5c03482_0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/de1511f3649b/ao5c03482_0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/33987070486c/ao5c03482_0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/97fc14bbf42f/ao5c03482_0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/01ef7dc2aab7/ao5c03482_0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/d9dd687f7a46/ao5c03482_0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0414/12242636/0127cd663608/ao5c03482_0010.jpg

Similar articles

1. Q‑Learning-Based Multivariate Nonlinear Model Predictive Controller: Experimental Validation on Batch Reactor for Temperature Trajectory Tracking.
   ACS Omega. 2025 Jun 26;10(26):28362-28371. doi: 10.1021/acsomega.5c03482. eCollection 2025 Jul 8.
2. Reinforcement Learning-Based Nonlinear Model Predictive Controller for a Jacketed Reactor: A Machine Learning Concept Validation Using Jetson Orin.
   ACS Omega. 2025 Jul 9;10(28):30864-30878. doi: 10.1021/acsomega.5c03219. eCollection 2025 Jul 22.
3. Inverse RL Scene Dynamics Learning for Nonlinear Predictive Control in Autonomous Vehicles.
   IEEE Trans Neural Netw Learn Syst. 2025 Aug;36(8):13754-13768. doi: 10.1109/TNNLS.2025.3549816.
4. Accelerated Value Iteration-Based Safe Q-Learning for Data-Driven Optimal Tracking Control.
   IEEE Trans Cybern. 2025 Jul;55(7):3511-3524. doi: 10.1109/TCYB.2025.3562172.
5. Prescription of Controlled Substances: Benefits and Risks
6. Adaptive Model Predictive Control for 4WD-4WS Mobile Robot: A Multivariate Gaussian Mixture Model-Ant Colony Optimization for Robust Trajectory Tracking and Obstacle Avoidance.
   Sensors (Basel). 2025 Jun 18;25(12):3805. doi: 10.3390/s25123805.
7. Privacy-Preserving Glycemic Management in Type 1 Diabetes: Development and Validation of a Multiobjective Federated Reinforcement Learning Framework.
   JMIR Diabetes. 2025 Jul 4;10:e72874. doi: 10.2196/72874.
8. Deep Reinforcement Learning-Based Self-Optimization of Flow Chemistry.
   ACS Eng Au. 2025 May 13;5(3):247-266. doi: 10.1021/acsengineeringau.5c00004. eCollection 2025 Jun 18.
9. Design of a novel and robust 2-DOF PIDA controller based on enzyme action optimizer for ball position regulation in magnetic levitation systems.
   Sci Rep. 2025 Aug 11;15(1):29360. doi: 10.1038/s41598-025-13967-4.
10. PENC: a predictive-estimative nonlinear control framework for robust target tracking of fixed-wing UAVs in complex urban environments.
    Sci Rep. 2025 Aug 13;15(1):29753. doi: 10.1038/s41598-025-13095-z.
