Department of Civil Engineering, Isfahan University of Technology, Isfahan, Iran.
Department of Civil and Architectural Engineering, Sultan Qaboos University, Muscat, Oman.
J Environ Manage. 2024 Jun;362:121259. doi: 10.1016/j.jenvman.2024.121259. Epub 2024 Jun 3.
Machine learning has recently emerged as a reliable way to monitor water quality parameters in aquatic environments such as reservoirs and lakes. This study employs both individual and hybrid techniques to improve the accuracy of dissolved oxygen (DO) and chlorophyll-a (Chl-a) predictions in the Wadi Dayqah Dam, Oman. First, an AAQ-RINKO device (CTD sensor) was used to collect water quality parameters at different locations and depths in the reservoir. Second, the dataset was segmented into homogeneous clusters based on DO and Chl-a using an optimized K-means algorithm to support more precise estimation. Third, ten individual data-driven models, namely generalized regression neural network (GRNN), random forest (RF), Gaussian process regression (GPR), decision tree (DT), least-squares boosting (LSB), Bayesian ridge (BR), support vector regression (SVR), K-nearest neighbors (KNN), multilayer perceptron (MLP), and group method of data handling (GMDH), were employed to estimate DO and Chl-a concentrations. Fourth, to improve prediction accuracy, Bayesian model averaging (BMA), entropy weighting (EW), and a new enhanced clustering-based hybrid technique called Entropy-ORNESS were employed to combine the model outputs. The Entropy-ORNESS method uses a genetic algorithm (GA) to determine optimal weights and then combines them with the EW weights. Finally, bootstrapping provides a stochastic assessment of model uncertainty, resulting in a robust estimator. In the validation phase, Entropy-ORNESS performs best among the three fusion-based methods and outperforms the individual models, yielding R values of 0.92 and 0.89 for the DO and Chl-a clusters, respectively.
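To make the entropy-weighting (EW) fusion step concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common formulation of the entropy weight method for combining forecasts, in which a model whose error pattern is more uneven across samples receives a smaller weight; the abstract does not state which variant was used. The function names `entropy_weights` and `combine`, and the toy numbers, are hypothetical.

```python
import numpy as np

def entropy_weights(abs_errors):
    """Entropy-based combination weights (one common variant, assumed here).

    abs_errors: array of shape (n_samples, n_models) holding non-negative
    errors of each individual model on a calibration set.
    Returns an array of n_models weights that sum to 1.
    """
    e = np.asarray(abs_errors, dtype=float)
    n, m = e.shape
    if m < 2:
        raise ValueError("need at least two models to combine")
    col_sums = e.sum(axis=0, keepdims=True)
    if np.any(col_sums == 0):
        raise ValueError("a model with all-zero errors cannot be normalised")
    p = e / col_sums                                  # each column sums to 1
    safe_p = np.where(p > 0, p, 1.0)                  # log(1) = 0 stands in for 0*log(0)
    entropy = -(p * np.log(safe_p)).sum(axis=0) / np.log(n)
    variation = np.clip(1.0 - entropy, 0.0, None)     # degree of variation per model
    total = variation.sum()
    if total == 0:                                    # all error columns are uniform
        return np.full(m, 1.0 / m)
    return (1.0 - variation / total) / (m - 1)

def combine(predictions, weights):
    """Weighted sum of model predictions, shape (n_samples, n_models) @ (n_models,)."""
    return np.asarray(predictions, dtype=float) @ np.asarray(weights, dtype=float)

# Toy illustration with made-up numbers (not data from the study).
toy_errors = np.array([[0.10, 0.01],
                       [0.10, 0.50],
                       [0.10, 0.02]])
toy_weights = entropy_weights(toy_errors)
toy_fused = combine(np.array([[7.0, 6.0], [5.0, 8.0]]), toy_weights)
```

Note that this variant rewards evenly distributed errors rather than small errors as such, which is one reason a method like Entropy-ORNESS might blend EW weights with weights optimised by another criterion.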
The proposed hybrid methodology shows lower uncertainty than the single data-driven models and the other two combination frameworks, with uncertainty levels of 0.24% and 1.16% for cluster 1 of DO and cluster 2 of Chl-a, respectively. Notably, a comparison of field measurements with the best hybrid technique shows that DO and Chl-a concentrations follow similar patterns of spatial variation across depths of the dam, with DO concentrations decreasing markedly during warmer seasons. Together, these findings underscore the potential of the upgraded weighted hybrid approach to provide more accurate estimates of DO and Chl-a concentrations in dynamic aquatic environments.
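As an illustration of how bootstrapping can attach an uncertainty estimate to a validation metric such as R, here is a minimal sketch under stated assumptions. It resamples observed/predicted pairs with replacement and recomputes the Pearson correlation each time. The abstract does not define how its uncertainty percentages were computed, so the percentile interval below is only one possible summary; the function name `bootstrap_r` and the synthetic data are hypothetical.

```python
import numpy as np

def bootstrap_r(observed, predicted, n_boot=1000, seed=0):
    """Bootstrap distribution of the Pearson correlation coefficient R.

    Resamples (observed, predicted) pairs with replacement n_boot times
    and returns the n_boot recomputed R values.
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    if obs.shape != pred.shape or obs.ndim != 1:
        raise ValueError("observed and predicted must be 1-D arrays of equal length")
    rng = np.random.default_rng(seed)
    n = obs.size
    r_values = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # sample indices with replacement
        r_values[b] = np.corrcoef(obs[idx], pred[idx])[0, 1]
    return r_values

# Synthetic stand-in for validation data (not measurements from the study).
demo_rng = np.random.default_rng(42)
demo_obs = np.linspace(2.0, 9.0, 50)                  # e.g. DO in mg/L
demo_pred = demo_obs + demo_rng.normal(0.0, 0.3, size=50)
demo_r = bootstrap_r(demo_obs, demo_pred, n_boot=500)
demo_interval = np.percentile(demo_r, [2.5, 97.5])    # 95% percentile interval
```

A narrower interval indicates that the metric is less sensitive to which samples happen to be in the validation set, which is the sense in which a fused model can be described as having lower uncertainty than its members.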