Department of Civil Engineering and Water Conservancy, Shandong University, Jinan, 250061, China.
Water Resources Research Institute of Shandong Province, Jinan, 250014, China.
Environ Sci Pollut Res Int. 2024 Jan;31(1):262-279. doi: 10.1007/s11356-023-31148-6. Epub 2023 Nov 28.
Accurate and efficient prediction of chlorophyll-a (Chl-a) concentration is crucial for the early detection of algal blooms in reservoirs. However, predicting Chl-a concentration from multivariate time series is challenging because of the complex interrelationships within the aquatic environment and the discrete, non-stationary nature of online water quality monitoring data. To address this issue, this paper proposes a novel prediction model, SGMD-KPCA-BiLSTM (SKB), for predicting Chl-a concentration. The model combines two-stage data processing with machine learning (ML). To capture nonlinear relationships in multivariate time series data, the optimal data subset is determined by combining symplectic geometry mode decomposition (SGMD) and kernel principal component analysis (KPCA). This subset is then input into a bidirectional long short-term memory (BiLSTM) model, whose hyperparameters are optimized using the sparrow search algorithm (SSA) to improve prediction accuracy. The performance of the model was evaluated at Qiaodian Reservoir in Shandong, China. The evaluation criteria were the root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE), coefficient of determination (R²), frequency histograms of the prediction error, and the Taylor diagram. The SKB model was compared against five single models, namely the back-propagation (BP) neural network, support vector regression (SVR), long short-term memory (LSTM), convolutional neural network with long short-term memory (CNN-LSTM), and BiLSTM, and against three hybrid models, namely SGMD-LSTM, SGMD-KPCA-LSTM, and SGMD-BiLSTM. The results demonstrated that the SKB model performed best in predicting Chl-a concentration (R² = 96.19%, RMSE = 1.05, MAE = 0.65, MAPE = 0.08) and significantly reduced the prediction error relative to the comparison models.
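For readers who want to reproduce the four scalar criteria named above, a minimal sketch using their standard textbook definitions may help. The function names and the toy data are illustrative assumptions, not taken from the paper, and MAPE is returned as a fraction rather than a percentage, which is only a guess at the convention behind the reported value of 0.08.

```python
import math

# Standard definitions of the four scalar criteria; names are illustrative.

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (assumes no zero observations)."""
    n = len(y_true)
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values for illustration only; these are not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "R2": r2(observed, predicted),
}
```

Because RMSE squares each residual, it penalizes occasional large misses more heavily than MAE, which is one reason studies of this kind usually report both.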
The multi-step predictive capability of the SKB model is also discussed. The analysis shows that predictive performance declines as the prediction time step increases, and that the SKB model performs slightly better than the other models at the corresponding prediction intervals. The model has significant advantages in accurately predicting the non-smooth and nonlinear Chl-a sequences observed by the online monitoring system. This study presents a potential solution for controlling and preventing reservoir eutrophication, as well as an innovative approach to water quality prediction.
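The abstract does not say how the multi-step forecasts are framed. One common framing, assumed here purely for illustration, is the direct strategy: a separate set of input/target pairs is built for each horizon, with the target shifted further ahead of the input window. The helper name `make_windows` and its parameters are hypothetical.

```python
def make_windows(series, n_lags, horizon):
    """Build (input window, target) pairs for direct h-step-ahead forecasting.

    Each input holds n_lags consecutive observations; the target is the value
    `horizon` steps after the last observation in that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets

# Toy sequence for illustration only; these are not Chl-a measurements.
toy = [1, 2, 3, 4, 5, 6]
x1, y1 = make_windows(toy, n_lags=3, horizon=1)  # one step ahead
x2, y2 = make_windows(toy, n_lags=3, horizon=2)  # two steps ahead
```

Under this framing a larger horizon leaves fewer training pairs and widens the gap between the last input and the target, which is consistent with the decline in performance reported for larger prediction steps.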