Department of Smart Cities, Chung-Ang University, Seoul, Republic of Korea.
State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, People's Republic of China.
Sci Rep. 2022 Nov 16;12(1):19717. doi: 10.1038/s41598-022-23436-x.
Dry days at various time scales are an important topic in climate discussions. A prolonged run of dry days defines a dry period. Dry days counted against a specific rainfall threshold can characterize the climate of a locality. The variation of monthly dry days from station to station could be correlated with several climatic factors. This study proposes a novel approach for predicting monthly dry days (MDD) at six target stations in Bangladesh using different machine learning (ML) algorithms. Several rainfall thresholds were used to prepare the datasets of monthly dry days (MDD) and monthly wet days (MWD). A group of ML algorithms, including Bagged Trees (BT), Exponential Gaussian Process Regression (EGPR), Matern Gaussian Process Regression (MGPR), Linear Support Vector Machine (LSVM), Fine Trees (FT) and Linear Regression (LR), were evaluated for building a competitive MDD prediction model. In validation, EGPR-based models captured MDD over Bangladesh better than the MGPR-, LSVM-, BT-, LR- and FT-based models. When MDD were the predictors for all six target stations, EGPR produced the highest mean R of 0.91 (min. 0.89, max. 0.92) and the lowest mean RMSE of 2.14 (min. 1.78, max. 2.69) among the models. An explicit evaluation of the ML algorithms using a one-year lead-time approach showed that BT and EGPR were the best-performing algorithms (R = 0.78 for both models). However, because it had the lowest RMSE, EGPR was chosen as the best model at one-year lead time. The dataset of monthly dry-wet days was the best predictor in the lead-time approach. In addition, a sensitivity analysis showed how sensitive the MDD predictions at the target stations were to each station. Monte Carlo simulation was introduced to assess the robustness of the developed models. The EGPR model remained robust up to a certain level of randomness in the testing data.
The findings of this study can inform the agricultural sector in mitigating the impacts of dry spells on agriculture.
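The quantities named in the abstract can be illustrated with a minimal Python sketch. It counts monthly dry and wet days from a daily rainfall record and computes R and RMSE for a set of predictions. This is not the authors' code: the function names, the dictionary-based input format, and the 1.0 mm threshold are assumptions chosen for illustration (the abstract says only that several thresholds were used), and R is assumed here to be the Pearson correlation coefficient.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def monthly_dry_wet_days(daily_rain_mm, threshold_mm=1.0):
    """Count dry and wet days per month.

    daily_rain_mm: dict mapping datetime.date -> rainfall in mm (hypothetical format).
    threshold_mm: a day with rainfall below this value counts as dry
                  (1.0 mm is an assumed value, not taken from the paper).
    Returns {(year, month): (dry_days, wet_days)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for day, rain in daily_rain_mm.items():
        key = (day.year, day.month)
        if rain < threshold_mm:
            counts[key][0] += 1  # dry day
        else:
            counts[key][1] += 1  # wet day
    return {key: tuple(value) for key, value in sorted(counts.items())}


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of R)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (std_o * std_p)


# Synthetic example: January 2020 with 5 rainy days followed by 26 dry days.
start = date(2020, 1, 1)
rain = {start + timedelta(days=i): (5.0 if i < 5 else 0.0) for i in range(31)}
january_counts = monthly_dry_wet_days(rain, threshold_mm=1.0)
```

Repeating the count with different `threshold_mm` values would give one MDD/MWD dataset per threshold, which is the kind of input the abstract describes feeding to the regression models.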