Multi-step ahead forecasting of electrical conductivity in rivers by using a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model enhanced by Boruta-XGBoost feature selection algorithm.

Author information

Karbasi Masoud, Ali Mumtaz, Bateni Sayed M, Jun Changhyun, Jamei Mehdi, Farooque Aitazaz Ahsan, Yaseen Zaher Mundher

Affiliations

Water Engineering Department, Faculty of Agriculture, University of Zanjan, Zanjan, Iran.

UniSQ College, University of Southern Queensland, Springfield Campus, QLD, 4301, Australia.

Publication information

Sci Rep. 2024 Jul 1;14(1):15051. doi: 10.1038/s41598-024-65837-0.

Abstract

Electrical conductivity (EC) is widely recognized as one of the most essential water quality metrics for predicting salinity and mineralization. In this study, the EC of two Australian rivers (Albert River and Barratta Creek) was forecasted up to 10 days ahead using a novel deep learning algorithm that combines a Convolutional Neural Network with a Long Short-Term Memory model (CNN-LSTM). The Boruta-XGBoost feature selection method was used to determine the significant model inputs (lagged time series data). The Boruta-XGB-CNN-LSTM model was compared against three machine learning approaches: multi-layer perceptron neural network (MLP), K-nearest neighbour (KNN), and extreme gradient boosting (XGBoost). Model performance was assessed with statistical metrics including the correlation coefficient (R), root mean square error (RMSE), and mean absolute percentage error (MAPE). Of the 10 years of data available for both rivers, 7 years (2012-2018) were used for training and 3 years (2019-2021) for testing. For one-day-ahead forecasting, the Boruta-XGB-CNN-LSTM model outperformed the other machine learning models on the test dataset at both stations (R = 0.9429, RMSE = 45.6896, MAPE = 5.9749 for Albert River, and R = 0.9215, RMSE = 43.8315, MAPE = 7.6029 for Barratta Creek). Given this better performance in both rivers, the model was then used to forecast EC 3-10 days ahead. It remained capable of forecasting EC over the next 10 days, with performance decreasing slightly as the forecasting horizon increased from 3 to 10 days. These results indicate that the Boruta-XGB-CNN-LSTM model can serve as a soft computing method for accurately predicting how EC will change in rivers.
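
The data preparation and evaluation steps named in the abstract (lagged inputs, a chronological train/test split, and the R, RMSE, and MAPE metrics) can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names, the synthetic series, the 70/30 split fraction, and the persistence baseline are all hypothetical, and MAPE is expressed in percent here, which is an assumption about the abstract's convention. Boruta-XGBoost selection and the CNN-LSTM model itself are not reproduced.

```python
import numpy as np


def make_lagged(series, n_lags, horizon):
    """Build (X, y) where X[i] holds n_lags consecutive values and
    y[i] is the value `horizon` steps after the last value in X[i]."""
    series = np.asarray(series, dtype=float)
    n = len(series) - n_lags - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for the requested lags/horizon")
    X = np.stack([series[i:i + n_lags] for i in range(n)])
    start = n_lags + horizon - 1
    y = series[start:start + n]
    return X, y


def chronological_split(X, y, train_fraction=0.7):
    """Split without shuffling so the test period follows the training period."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


def pearson_r(obs, pred):
    """Correlation coefficient (R) between observed and predicted values."""
    return float(np.corrcoef(np.asarray(obs, float), np.asarray(pred, float))[0, 1])


def rmse(obs, pred):
    """Root mean square error, in the units of the observed variable."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def mape(obs, pred):
    """Mean absolute percentage error in percent (assumes no zero observations)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(100.0 * np.mean(np.abs((obs - pred) / obs)))


if __name__ == "__main__":
    # Synthetic stand-in for a daily EC record (hypothetical values, not study data).
    rng = np.random.default_rng(0)
    days = np.arange(1000)
    ec = 500.0 + 100.0 * np.sin(2 * np.pi * days / 365.0) + rng.normal(0.0, 10.0, days.size)

    X, y = make_lagged(ec, n_lags=7, horizon=1)
    (X_train, y_train), (X_test, y_test) = chronological_split(X, y)

    # Persistence baseline: predict that tomorrow equals the most recent lag.
    baseline = X_test[:, -1]
    print("R    =", pearson_r(y_test, baseline))
    print("RMSE =", rmse(y_test, baseline))
    print("MAPE =", mape(y_test, baseline))
```

In a full pipeline, the columns of `X` would first be screened by a feature selection step and then passed to the forecasting model. Changing `horizon` to values from 3 to 10 gives the multi-step targets.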

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef53/11217395/f6038a8667ff/41598_2024_65837_Fig1_HTML.jpg
