Norouzi Armin, Shahpouri Saeid, Gordon David, Shahbakhti Mahdi, Koch Charles Robert
Department of Mechanical Engineering, University of Alberta, Edmonton, AB, Canada.
Proc Inst Mech Eng Part I J Syst Control Eng. 2023 Sep;237(8):1440-1453. doi: 10.1177/09596518231153445. Epub 2023 Feb 17.
A deep reinforcement learning application is investigated to control the emissions of a compression ignition diesel engine. The main purpose of this study is to reduce engine-out nitrogen oxide emissions and to minimize fuel consumption while tracking a reference engine load. First, a physics-based engine simulation model is developed in GT-Power and calibrated using experimental data. Using this model in a GT-Power/Simulink co-simulation, a deep deterministic policy gradient agent is developed. To reduce the risk of an unwanted output, a safety filter is added to the deep reinforcement learning controller. Based on the simulation results, this filter has no effect on the final trained agent; during the training process, however, it is crucial for enforcing constraints on the controller output. The resulting safe reinforcement learning controller is then compared with an iterative learning controller and a deep neural network-based nonlinear model predictive controller. The first comparison shows that safe reinforcement learning can accurately track an arbitrary reference input, while the iterative learning controller is limited to a repetitive reference. The comparison with nonlinear model predictive control indicates that, for this case, reinforcement learning is able to learn the optimal control output directly from the experiment without the need for a model. However, enforcing the output constraint for safe reinforcement learning requires a simple model of the system. In this work, reinforcement learning reduced emissions more than the nonlinear model predictive control did; however, it showed slightly higher load-tracking error and higher fuel consumption.
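The abstract does not say how the safety filter is built, so the following is only a minimal sketch of one common way to filter an agent's output using a simple system model. It assumes a hypothetical `predict_output` function standing in for the simple model, a known `safe_action` that satisfies the constraint, and an output that grows monotonically along the line from `safe_action` to the proposed action. All names and numbers are illustrative and are not taken from the paper.

```python
import numpy as np


def safety_filter(action, low, high, predict_output, y_max, safe_action, steps=20):
    """Return an action that respects actuator bounds and a predicted output limit.

    Hypothetical helper: first clip the proposed action to [low, high], then,
    if the simple model predicts an output above y_max, bisect along the line
    from safe_action to the clipped action until the prediction is within the limit.
    """
    u = np.clip(np.asarray(action, dtype=float), low, high)
    if predict_output(u) <= y_max:
        return u  # proposed action is already acceptable

    safe = np.asarray(safe_action, dtype=float)
    lo, hi = 0.0, 1.0  # fraction of the way from safe_action to u
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        candidate = safe + mid * (u - safe)
        if predict_output(candidate) <= y_max:
            lo = mid  # candidate is safe, try moving closer to u
        else:
            hi = mid  # candidate violates the limit, back off
    return safe + lo * (u - safe)


if __name__ == "__main__":
    # Assumed toy linear model of a constrained output (illustration only).
    model = lambda u: 2.0 * u[0] + u[1]
    filtered = safety_filter(
        action=[10.0, 3.0], low=[0.0, 0.0], high=[5.0, 5.0],
        predict_output=model, y_max=10.0, safe_action=[0.0, 0.0],
    )
    print(filtered, model(filtered))
```

In a training loop, a filter of this kind would sit between the agent and the plant, so that exploratory actions are corrected before they are applied. This is consistent with the abstract's remark that the filter matters mainly during training.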