Department of Agricultural Engineering, Institute of Agriculture, Visva-Bharati (A Central University), Sriniketan, Birbhum, West Bengal, 731236, India.
Water Resources Engineering and Management Institute, Faculty of Technology & Engineering, The Maharaja Sayajirao University of Baroda, Gujarat, India.
Environ Sci Pollut Res Int. 2024 Jul;31(35):48497-48522. doi: 10.1007/s11356-024-34286-7. Epub 2024 Jul 20.
Flooding is a major natural hazard worldwide, causing catastrophic damage to communities and infrastructure. Because climate change is exacerbating extreme weather events, robust flood hazard modeling is crucial to support disaster resilience and adaptation. This study uses multi-sourced geospatial datasets to develop an advanced machine learning framework for flood hazard assessment in the Arambag region of West Bengal, India. The flood inventory was constructed through Sentinel-1 SAR analysis and global flood databases. Fifteen flood conditioning factors related to topography, land cover, soil, rainfall, proximity, and demographics were incorporated. Diverse machine learning models, including the RF, AdaBoost, rFerns, XGB, DeepBoost, GBM, SDA, BAM, monmlp, and MARS algorithms, were rigorously trained and tested for categorical flood hazard mapping. Model optimization was achieved through statistical feature selection techniques. Accuracy metrics and advanced model interpretability methods such as SHAP and Boruta were used to evaluate the models. According to the area under the receiver operating characteristic curve (AUC), the prediction accuracy of the models was generally above 80%. RF achieved an AUC of 0.847 at resampling factor 5, indicating strong discriminative performance. AdaBoost also showed consistently good discriminative ability, with an AUC of 0.839 at resampling factor 10. Boruta and SHAP analysis indicated that precipitation and elevation contribute most significantly to flood hazard assessment in the study area. Most of the machine learning models identified the southern portions of the study area as highly susceptible. On average, 17.2 to 18.6% of the study area is highly susceptible to flood hazards.
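To make the AUC figures above concrete, the following is a minimal sketch of how an AUC can be computed from binary flood labels and model scores using the rank-based (Mann-Whitney) formulation. It is not the authors' code: the function name `roc_auc`, the labels, and the scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the probability that a randomly chosen flooded
    sample (label 1) scores higher than a randomly chosen non-flooded
    sample (label 0). Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = flooded location, 0 = non-flooded location,
# with a susceptibility score in [0, 1] from some classifier.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.7, 0.6]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of 0.847 means that a randomly chosen flooded location receives a higher susceptibility score than a randomly chosen non-flooded one about 84.7% of the time. The pairwise loop is quadratic in the number of samples, so for large inventories a sorted-rank implementation or a library routine would be preferable.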
In the feature selection analysis, various nature-inspired algorithms selected the following input parameters for flood hazard assessment: elevation, precipitation, distance to rivers, TWI, geomorphology, lithology, TRI, slope, soil type, curvature, NDVI, distance to roads, and gMIS. The Boruta and SHAP analyses showed that elevation, precipitation, and distance to rivers play the most crucial roles in the decision-making process for flood hazard assessment. The results indicated that 15.27% of the building footprints are at high or very high risk, while the remainder are at very low risk (43.80%), low risk (24.30%), and moderate risk (16.63%). Similarly, the cropland area affected by flooding in this region falls into five risk classes: very high (16.85%), high (17.28%), moderate (16.07%), low (16.51%), and very low (33.29%). Overall, this interdisciplinary study contributes significantly to hydraulic and hydrological modeling for flood hazard management.