College of Locomotive and Rolling Stock Engineering, Dalian Jiaotong University, Dalian 116000, China.
CRRC Nanjing Puzhen Co., Ltd., Nanjing 210031, China.
Rev Sci Instrum. 2024 Sep 1;95(9). doi: 10.1063/5.0225277.
In modern industrial systems, remaining useful life (RUL) prediction is a valuable maintenance strategy for health management because it allows equipment operating status to be monitored in real time and helps ensure the safety of industrial production. Current studies have largely concentrated on deep learning (DL) techniques, leaving few RUL prediction methods that use deep reinforcement learning (DRL). To address this gap, this paper introduces a DRL-based approach to RUL prediction that combines a Convolutional Neural Network-Bidirectional Long Short-Term Memory network (CNN-BiLSTM) with the Deep Deterministic Policy Gradient (DDPG) algorithm. The proposed method reframes the conventional task of estimating RUL as a Markov decision process (MDP), integrating the feature extraction capabilities of DL with the decision-making abilities of DRL. First, a hybrid CNN-BiLSTM is used to build an agent that can extract degradation features from raw signals. Next, the DDPG algorithm is used to develop the RUL prediction mechanism, and the MDP is completed by defining appropriate action spaces and reward functions. Through repeated trial and optimization, the agent learns to map the current operating state of a rolling bearing to its remaining service life. The method was validated on the Intelligent Maintenance Systems (IMS) bearing dataset. The findings suggest that the DRL-based approach outperforms current methods, achieving better root mean square error (RMSE) and mean square error (MSE) values, and that its predictions align more closely with the actual lifespan values.
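The MDP framing described above can be illustrated with a toy environment. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the action space or the reward function, so the normalized action range [0, 1], the linearly decreasing RUL label, and the negative-absolute-error reward are all assumptions made here for illustration. The names `RULEnv` and `rollout` are hypothetical, and the plain feature vectors stand in for what a CNN-BiLSTM would extract; a DDPG actor would take the place of the `policy` callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RULEnv:
    """Toy MDP over a single run-to-failure sequence (hypothetical).

    features holds one feature vector per time step, standing in for
    degradation features extracted from raw signals.
    """
    features: List[List[float]]
    t: int = 0

    def true_rul(self, t: int) -> float:
        # Assumed label: normalized RUL falling linearly from 1.0 to 0.0.
        last = len(self.features) - 1
        return 1.0 - t / last if last > 0 else 0.0

    def reset(self) -> List[float]:
        # Start a new episode at the first time step.
        self.t = 0
        return self.features[0]

    def step(self, action: float) -> Tuple[List[float], float, bool]:
        # Assumed action: predicted normalized RUL, clipped to [0, 1].
        pred = min(1.0, max(0.0, action))
        # Assumed reward: negative absolute prediction error.
        reward = -abs(pred - self.true_rul(self.t))
        self.t += 1
        done = self.t >= len(self.features)
        next_state = self.features[-1] if done else self.features[self.t]
        return next_state, reward, done


def rollout(env: RULEnv, policy: Callable[[List[float]], float]) -> float:
    """Run one episode and return the summed reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

In this sketch, a policy whose predictions track the label exactly earns a total reward of zero, and any prediction error lowers it. That is the signal a DRL agent would be trained to maximize under these assumptions.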