Prediction of meteorological drought and standardized precipitation index based on the random forest (RF), random tree (RT), and Gaussian process regression (GPR) models.

Authors

Elbeltagi Ahmed, Pande Chaitanya B, Kumar Manish, Tolche Abebe Debele, Singh Sudhir Kumar, Kumar Akshay, Vishwakarma Dinesh Kumar

Affiliations

Agricultural Engineering Department, Faculty of Agriculture, Mansoura University, Mansoura, 35516, Egypt.

Indian Institute of Tropical Meteorology, Pune, India.

Publication information

Environ Sci Pollut Res Int. 2023 Mar;30(15):43183-43202. doi: 10.1007/s11356-023-25221-3. Epub 2023 Jan 17.

DOI:10.1007/s11356-023-25221-3

PMID:36648725

Abstract

Agricultural, meteorological, and hydrological droughts are natural hazards that affect ecosystems in Maharashtra state, central India. Because the historical data available for drought monitoring and forecasting in this region are limited, machine learning (ML) algorithms could help predict future drought events. This paper examines how accurately meteorological drought in this semi-arid region can be predicted from the standardized precipitation index (SPI) using random forest (RF), random tree (RT), and Gaussian process regression (GPR-PUK kernel) models. Different combinations of machine learning models and input variables were tested for forecasting meteorological drought at the SPI-6 and SPI-12 (6- and 12-month) scales. Models were developed using monthly rainfall data for 2000-2019 from two meteorological stations, Karanjali and Gangawdi, each representing a geographical region of the Upper Godavari river basin in Maharashtra state, central India, which frequently experiences droughts. SPI data from 2000 to 2013 were used to train the models, and the remaining data from 2014 to 2019 were used to test the forecasts of SPI and meteorological drought. The mean square error (MSE), root mean square error (RMSE), adjusted R², Mallows' Cp, Akaike's information criterion (AIC), Schwarz's Bayesian criterion (SBC), and Amemiya's prediction criterion (PC) were used to identify the best input combination through best subset regression analysis for SPI-6 and SPI-12 at both stations. The correlation coefficient, mean absolute error (MAE), RMSE, relative absolute error (RAE), and root relative squared error (RRSE) were used to evaluate the RF, RT, and GPR-PUK kernel models for SPI-6 and SPI-12 at both stations during training and testing.
In the testing phase, RF was the best model for forecasting droughts at Gangawdi station, with correlation coefficient, MAE, RMSE, RAE (%), and RRSE (%) values of 0.856, 0.551, 0.718, 74.778, and 54.019, respectively, for SPI-6 and 0.961, 0.361, 0.538, 34.926, and 28.262, respectively, for SPI-12. At Karanjali station, the corresponding values were 0.913, 0.541, 0.604, 52.592, and 42.315 for the GPR-PUK kernel model at SPI-6 and 0.966, 0.386, 0.589, 36.959, and 31.394 for the RT model at SPI-12. Machine learning models are promising drought warning tools because they take less time, require fewer inputs, and are less complex than dynamic or scientific models.
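
The five evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the conventional textbook definitions, in which RAE and RRSE compare the model's errors with those of a baseline that always predicts the mean of the observed values. The abstract does not state which software or baseline the authors used, so the function name `evaluate` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def evaluate(observed, predicted):
    """Return r, MAE, RMSE, RAE (%) and RRSE (%) for paired SPI values.

    Assumption: the RAE/RRSE baseline is the mean of the observed values.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    errors = [p - o for o, p in zip(observed, predicted)]
    abs_err = sum(abs(e) for e in errors)
    sq_err = sum(e * e for e in errors)

    # Spread of the observations around their mean (the naive baseline).
    base_abs = sum(abs(o - mean_obs) for o in observed)
    base_sq = sum((o - mean_obs) ** 2 for o in observed)
    pred_sq = sum((p - mean_pred) ** 2 for p in predicted)
    if base_sq == 0 or pred_sq == 0:
        raise ValueError("observed and predicted values must not be constant")

    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))

    return {
        "r": cov / math.sqrt(base_sq * pred_sq),       # Pearson correlation
        "MAE": abs_err / n,                            # mean absolute error
        "RMSE": math.sqrt(sq_err / n),                 # root mean square error
        "RAE_pct": 100.0 * abs_err / base_abs,         # relative absolute error
        "RRSE_pct": 100.0 * math.sqrt(sq_err / base_sq),  # root relative squared error
    }


# Made-up SPI values, for illustration only.
metrics = evaluate([-1.0, 0.0, 1.0, 2.0], [-0.5, 0.0, 0.5, 2.0])
print(metrics)
```

Under these definitions, RAE or RRSE below 100% means the model outperforms the mean-only baseline, which is how the percentages reported for the two stations can be read.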

Similar articles

Assessment of drought conditions and prediction by machine learning algorithms using Standardized Precipitation Index and Standardized Water-Level Index (case study: Yazd province, Iran).

Environ Sci Pollut Res Int. 2023 Sep;30(45):101744-101760. doi: 10.1007/s11356-023-29522-5. Epub 2023 Sep 1.

Forecasting standardized precipitation index using data intelligence models: regional investigation of Bangladesh.

Sci Rep. 2021 Feb 9;11(1):3435. doi: 10.1038/s41598-021-82977-9.

Fusion-based framework for meteorological drought modeling using remotely sensed datasets under climate change scenarios: Resilience, vulnerability, and frequency analysis.

J Environ Manage. 2021 Nov 1;297:113283. doi: 10.1016/j.jenvman.2021.113283. Epub 2021 Jul 16.

Comparison of hybrid machine learning methods for the prediction of short-term meteorological droughts of Sakarya Meteorological Station in Turkey.

Environ Sci Pollut Res Int. 2022 Oct;29(50):75487-75511. doi: 10.1007/s11356-022-21083-3. Epub 2022 Jun 3.

Extreme climate index estimation and projection in association with enviro-meteorological parameters using random forest-ARIMA hybrid model over the Vidarbha region, India.

Environ Monit Assess. 2023 Feb 9;195(3):380. doi: 10.1007/s10661-022-10902-2.

Regional frequency analysis of drought severity and duration in Karkheh River Basin, Iran using univariate L-moments method.

Environ Monit Assess. 2022 Apr 7;194(5):336. doi: 10.1007/s10661-022-09977-8.

An improved SPEI drought forecasting approach using the long short-term memory neural network.

J Environ Manage. 2021 Apr 1;283:111979. doi: 10.1016/j.jenvman.2021.111979. Epub 2021 Jan 19.

Drought index prediction using advanced fuzzy logic model: Regional case study over Kumaon in India.

PLoS One. 2020 May 21;15(5):e0233280. doi: 10.1371/journal.pone.0233280. eCollection 2020.

Spatio-temporal drought assessment of the Subarnarekha River basin, India, using CHIRPS-derived hydrometeorological indices.

Environ Monit Assess. 2022 Oct 17;194(12):902. doi: 10.1007/s10661-022-10547-1.

Cited by

Decomposition-reconstruction-optimization framework for hog price forecasting: Integrating STL, PCA, and BWO-optimized BiLSTM.

PLoS One. 2025 Jun 27;20(6):e0324646. doi: 10.1371/journal.pone.0324646. eCollection 2025.

Air temperature estimation and modeling using data driven techniques based on best subset regression model in Egypt.

Sci Rep. 2025 Jun 20;15(1):20200. doi: 10.1038/s41598-025-06277-2.

CNSW 1.0: Prefectural Reconstruction of China's Surface Water Resources Using Machine Learning Methods.

Sci Data. 2025 Jun 19;12(1):1032. doi: 10.1038/s41597-025-05389-8.

Machine learning-based drought prediction using Palmer Drought Severity Index and TerraClimate data in Ethiopia.

PLoS One. 2025 Jun 18;20(6):e0326174. doi: 10.1371/journal.pone.0326174. eCollection 2025.

Integration of Gaussian process regression and K means clustering for enhanced short term rainfall runoff modeling.

Sci Rep. 2025 Mar 3;15(1):7444. doi: 10.1038/s41598-025-91339-8.

Establishment and Evaluation of Atmospheric Water Vapor Inversion Model Without Meteorological Parameters Based on Machine Learning.

Sensors (Basel). 2025 Jan 12;25(2):420. doi: 10.3390/s25020420.

Comparative assessment of empirical and hybrid machine learning models for estimating daily reference evapotranspiration in sub-humid and semi-arid climates.

Sci Rep. 2025 Jan 20;15(1):2542. doi: 10.1038/s41598-024-83859-6.

Optimising Venturi flume oxygen transfer efficiency using uncertainty-aware decision trees.

Water Sci Technol. 2024 Dec;90(12):3210-3240. doi: 10.2166/wst.2024.393.

Screening the Best Risk Model and Susceptibility SNPs for Chronic Obstructive Pulmonary Disease (COPD) Based on Machine Learning Algorithms.

Int J Chron Obstruct Pulmon Dis. 2024 Nov 5;19:2397-2414. doi: 10.2147/COPD.S478634. eCollection 2024.

Hybrid modeling approaches for agricultural commodity prices using CEEMDAN and time delay neural networks.

Sci Rep. 2024 Nov 4;14(1):26639. doi: 10.1038/s41598-024-74503-4.
