Kherad Mahdi, Moayyedi Mohammad Kazem, Fotouhi-Ghazvini Faranak, Vahabi Maryam, Fotouhi Hossein
Department of Computer Engineering and IT, University of Qom, Qom 46611, Iran.
CFD and Turbulence Research Lab., Department of Mechanical Engineering, University of Qom, Qom 46611, Iran.
Sensors (Basel). 2025 Aug 19;25(16):5149. doi: 10.3390/s25165149.
In cyber-physical systems governed by nonlinear partial differential equations (PDEs), real-time control is often limited by sparse sensor data and high-dimensional system dynamics. Deep reinforcement learning (DRL) has shown promise for controlling such systems, but training DRL agents directly on full-order simulations is computationally intensive. This paper presents a sensor-driven, non-intrusive reduced-order modeling (NIROM) framework called FAE-CAE-LSTM, which combines convolutional and fully connected autoencoders with a long short-term memory (LSTM) network. The model compresses high-dimensional states into a latent space and captures their temporal evolution. A DRL agent is trained entirely in this reduced space, interacting with the surrogate built from sensor-like spatiotemporal measurements, such as pressure and velocity fields. A CNN-MLP reward estimator provides data-driven feedback without requiring access to governing equations. The method is tested on benchmark systems including Burgers' equation, the Kuramoto-Sivashinsky equation, and flow past a circular cylinder; accuracy is further validated on flow past a square cylinder. Experimental results show that the proposed approach achieves accurate reconstruction, robust control, and significant computational speedup over traditional simulation-based training. These findings confirm the effectiveness of the FAE-CAE-LSTM surrogate in enabling real-time, sensor-informed, scalable DRL-based control of nonlinear dynamical systems.
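The core idea of the surrogate described above — compress the full-order field with an autoencoder, advance the latent state with an LSTM, and decode back only when a full field is needed — can be illustrated with a minimal sketch. This is not the authors' FAE-CAE-LSTM implementation: the linear encoder/decoder, the single untrained LSTM cell, and all dimensions and names (`LinearAE`, `surrogate_rollout`, `n_full`, `n_latent`) are illustrative assumptions standing in for the trained convolutional and fully connected autoencoders of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LinearAE:
    """Toy linear stand-in for the paper's CAE/FAE encoder-decoder pair."""
    def __init__(self, n_full, n_latent):
        self.W_enc = rng.standard_normal((n_latent, n_full)) * 0.1
        self.W_dec = rng.standard_normal((n_full, n_latent)) * 0.1

    def encode(self, u):
        # Compress a high-dimensional state (e.g. a flow field) to latent space.
        return self.W_enc @ u

    def decode(self, z):
        # Lift a latent vector back to the full-order state.
        return self.W_dec @ z

class LSTMCell:
    """Minimal LSTM cell propagating the latent state in time."""
    def __init__(self, n):
        self.n = n
        self.W = rng.standard_normal((4 * n, 2 * n)) * 0.1
        self.b = np.zeros(4 * n)

    def step(self, z, h, c):
        # Standard gate layout: input, forget, output, candidate.
        g = self.W @ np.concatenate([z, h]) + self.b
        i, f, o, gc = np.split(g, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(gc)
        h = sigmoid(o) * np.tanh(c)
        return h, c

def surrogate_rollout(ae, cell, u0, n_steps):
    """Roll the reduced-order model forward entirely in latent space."""
    z = ae.encode(u0)
    h = np.zeros(cell.n)
    c = np.zeros(cell.n)
    states = []
    for _ in range(n_steps):
        h, c = cell.step(z, h, c)
        z = h                        # LSTM output is the next latent state
        states.append(ae.decode(z))  # decode only to inspect the full field
    return np.array(states)

n_full, n_latent = 256, 8
ae = LinearAE(n_full, n_latent)
cell = LSTMCell(n_latent)
traj = surrogate_rollout(ae, cell, rng.standard_normal(n_full), 20)
print(traj.shape)  # (20, 256)
```

In the paper's setting, a DRL agent would interact with such a rollout instead of the full-order PDE solver, which is the source of the reported computational speedup: each latent step costs a few small matrix-vector products rather than a full CFD time step.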