Yuan Jiang, Dengxin Hua, Yufeng Wang, Xueting Yang, Huige Di, Qing Yan
Centre for Lidar Remote Sensing Research, Xi'an University of Technology, Xi'an, 710048, China.
Department of Physics and Electrical Engineering, Shaanxi University of Technology, Hanzhong, 723001, Shaanxi, China.
Sci Rep. 2025 Jul 1;15(1):21260. doi: 10.1038/s41598-025-05877-2.
Because ozone is central to understanding air quality and climate change, this study presents a deep learning method for predicting atmospheric ozone concentrations. The method combines an attention mechanism with a convolutional neural network (CNN) and a long short-term memory (LSTM) network to address the nonlinear nature of multivariate time-series data. The CNN and LSTM extract features from short time series, and the attention mechanism enhances them to improve short-term prediction accuracy. The input consists of eight meteorological and environmental parameters, selected using principal component analysis (PCA), from 16,806 records (2018-2019). The attention-based CNN-LSTM hybrid model uses the following settings: a time step of 5, a batch size of 25, 15 units in the LSTM layer, the ReLU activation function, 25 epochs, and a dropout rate of 0.15 to avoid overfitting. Experimental results demonstrate that this hybrid model outperforms the individual models and the CNN-LSTM model without attention, especially in forward prediction with a multi-hour time lag. For a 1-hour time lag, the model achieves a high coefficient of determination (R² = 0.971) and a root mean square error of 3.59. It also shows consistent accuracy across seasons, highlighting its robustness and its strong time-series prediction capability for ozone concentrations.
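To make the reported setup concrete, the following is a minimal sketch of how hourly records could be turned into supervised samples with a time step of 5 and a chosen forecast lag, and how the two reported metrics (coefficient of determination and root mean square error) are conventionally computed. It is not the authors' code: the function names `make_windows`, `rmse`, and `r_squared` are hypothetical, the windowing scheme is an assumption about how "time step" and "time lag" are applied, and the network itself is omitted.

```python
import math

TIME_STEP = 5  # window length stated in the abstract


def make_windows(rows, target, time_step=TIME_STEP, lag=1):
    """Build (X, y) pairs from a multivariate series.

    rows   : list of feature vectors, one per hour (assumed layout)
    target : list of ozone values aligned with rows
    X[i] holds time_step consecutive rows; y[i] is the target value
    lag hours after the last row of that window.
    """
    X, y = [], []
    last_start = len(rows) - time_step - lag
    for start in range(last_start + 1):
        end = start + time_step  # exclusive end of the window
        X.append(rows[start:end])
        y.append(target[end - 1 + lag])
    return X, y


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy data: 10 hourly records with 8 features each (values are synthetic).
    rows = [[float(i)] * 8 for i in range(10)]
    target = [float(i) for i in range(10)]
    X, y = make_windows(rows, target, lag=1)
    print(len(X), len(X[0]), len(X[0][0]))
```

Under this windowing assumption, a series of n records yields n - time_step - lag + 1 samples, so longer lags leave slightly fewer training pairs from the same data.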