Hassan Ahmed M, Ababneh Jafar, Attar Hani, Shamseldin Tamer, Abdelbaset Ahmed, Metwally Mohamed Eladly
Department of Electrical Power and Machines Engineering, Faculty of Engineering, Benha University, Shoubra, Cairo, Egypt.
Department of Electrical Power and Machines Engineering, Higher Institute of Engineering (HIE), El-Shorouk Academy, El-Shorouk City, Egypt.
PLoS One. 2025 Jan 3;20(1):e0316326. doi: 10.1371/journal.pone.0316326. eCollection 2025.
Enhancing the control performance of the five-phase interior permanent magnet synchronous motor (5ph-IPMSM) plays a crucial role in advancing innovative applications such as electric vehicles. This paper proposes a new reinforcement learning (RL) control algorithm, based on the twin-delayed deep deterministic policy gradient (TD3) algorithm, to tune two cascaded PI controllers in a 5ph-IPMSM drive system based on model predictive control (MPC). The main purpose of the control methodology is to optimize the 5ph-IPMSM speed response in either the constant-torque region or the constant-power region. The speed responses obtained using the RL control algorithm are compared with those obtained using four of the most recent metaheuristic optimization techniques (MHOT): Transit Search (TS), Honey Badger Algorithm (HBA), Dwarf Mongoose (DM), and Dandelion Optimizer (DO). The speed responses are compared in terms of settling time, rise time, maximum time, and maximum overshoot percentage. The suggested RL-based TD3 is found to give the minimum settling time and relatively low values of rise time, maximum time, and overshoot percentage, so the RL approach provides superior speed responses compared with those obtained from the four MHOT. The drive system speed responses are obtained in the constant-torque and constant-power regions using the MATLAB Simulink package.
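The four comparison metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB Simulink code. It assumes that "maximum time" means the time at which the response peaks, that rise time is measured from 10% to 90% of the reference, and that settling uses a 2% band; the abstract states none of these conventions. The function name `step_metrics` and the second-order test signal (damping ratio 0.5, natural frequency 10 rad/s, 1500 rpm reference) are hypothetical and stand in for a simulated speed response.

```python
import math


def step_metrics(t, y, target, band=0.02):
    """Step-response metrics for samples y taken at times t.

    Assumed conventions (not taken from the paper): 10-90% rise time,
    2% settling band, step applied at t[0], positive target.
    """
    def first_time_at_or_above(level):
        # Time of the first sample that reaches the given level, or None.
        return next((ti for ti, yi in zip(t, y) if yi >= level), None)

    t10 = first_time_at_or_above(0.1 * target)
    t90 = first_time_at_or_above(0.9 * target)
    rise_time = t90 - t10 if t10 is not None and t90 is not None else None

    # Peak ("maximum") time and percentage overshoot above the target.
    k_peak = max(range(len(y)), key=lambda k: y[k])
    peak_time = t[k_peak] - t[0]
    overshoot_pct = max(0.0, (y[k_peak] - target) / target * 100.0)

    # Settling time: first sample after the last excursion outside the band.
    outside = [k for k, yk in enumerate(y) if abs(yk - target) > band * target]
    if not outside:
        settling_time = 0.0
    elif outside[-1] + 1 < len(t):
        settling_time = t[outside[-1] + 1] - t[0]
    else:
        settling_time = None  # never settles within the recorded window

    return {
        "rise_time": rise_time,
        "peak_time": peak_time,
        "overshoot_pct": overshoot_pct,
        "settling_time": settling_time,
    }


# Hypothetical underdamped second-order speed response (illustration only).
zeta, wn, target = 0.5, 10.0, 1500.0
wd = wn * math.sqrt(1.0 - zeta ** 2)
t = [k * 0.001 for k in range(3001)]
y = [
    target * (1.0 - math.exp(-zeta * wn * tk)
              * (math.cos(wd * tk)
                 + zeta / math.sqrt(1.0 - zeta ** 2) * math.sin(wd * tk)))
    for tk in t
]

metrics = step_metrics(t, y, target)
print(metrics)
```

Applying the same function to the speed trace produced by each tuning method (RL-TD3, TS, HBA, DM, DO) would give a like-for-like comparison, provided all traces share the same reference speed and sampling.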