Mebirouk Nadjib, Amrane Moussa, Messast Salah, Ayadat Tahar
Civil Engineering Department, Faculty of Technology, Laboratory LMGHU, University 20 Août 1955-Skikda, Skikda, Algeria.
Civil Engineering Department, Faculty of Technology, Laboratory LGC-ROI, University of Batna 2 - Mostefa Ben Boulaid, 53, Constantine Road, Fesdis, 05078, Batna, Algeria.
Environ Sci Pollut Res Int. 2025 Jun;32(30):18434-18460. doi: 10.1007/s11356-025-36761-1. Epub 2025 Jul 24.
Landslide susceptibility mapping has become an essential task for ensuring economic and social sustainability. Machine learning algorithms are now widely applied to this task and have demonstrated high performance. However, researchers often face the challenge of validating these models or selecting the best one among them. This research emphasizes the importance of multi-criteria evaluation in assessing the performance of three ensemble learning models, namely gradient boosting classifier (GBC), light gradient boosting machine (LGBM), and extreme gradient boosting (XGBoost), used to produce a landslide susceptibility map (LSM) of the Oued Guebli watershed (northwestern region of Skikda, Algeria). A comprehensive database was created, incorporating a landslide inventory of 284 points and eight causative factors: lithology, slope, normalized difference vegetation index (NDVI), topographic wetness index (TWI), land use, and distance to roads, watercourses, and geological faults. The database was then split into a training set (70%) and a test set (30%). The performance of the models was first assessed using classical evaluation metrics. All models exhibited similar performance on these metrics, achieving high accuracy (0.9884), precision (0.9886), specificity (1.00), sensitivity (0.9884), F1-score (0.9884), and Pearson's correlation R (0.9770), with a low root mean square error (RMSE, 0.1078). This similarity highlights the need for complementary evaluation methods that can distinguish subtle differences between the models. The study therefore employs additional validation techniques, starting with the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, which revealed clearer differences in model performance: GBC achieved the best AUC (0.9911), followed by XGBoost (0.9891) and LGBM (0.9794).
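The contrast between threshold-based metrics and AUC can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy labels and scores, the 0.5 decision threshold, the positive-class definitions of precision and sensitivity, and the function names `classical_metrics`, `roc_auc`, and `to_labels`. It is not the authors' pipeline and does not reproduce their data.

```python
import math

def classical_metrics(y_true, y_pred):
    """Threshold-based metrics for binary labels (1 = landslide, 0 = non-landslide)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    # RMSE computed on hard 0/1 labels (an assumption of this sketch).
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "rmse": rmse,
    }

def roc_auc(y_true, scores):
    """AUC as the share of (landslide, non-landslide) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def to_labels(scores, threshold=0.5):
    """Convert predicted probabilities to hard labels (threshold is an assumption)."""
    return [1 if s >= threshold else 0 for s in scores]

# Hypothetical toy data for two models, not values from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.40, 0.45, 0.3, 0.2, 0.1]
scores_b = [0.9, 0.8, 0.7, 0.48, 0.45, 0.3, 0.2, 0.1]

metrics_a = classical_metrics(y_true, to_labels(scores_a))
metrics_b = classical_metrics(y_true, to_labels(scores_b))
auc_a = roc_auc(y_true, scores_a)
auc_b = roc_auc(y_true, scores_b)

print(metrics_a == metrics_b)  # identical threshold-based metrics
print(auc_a, auc_b)            # different ranking quality
```

In this toy case both models yield the same hard predictions, so every threshold-based metric coincides, while the AUC values differ (0.9375 versus 1.0) because AUC depends on how the scores rank the samples. Note also that when RMSE is computed on hard 0/1 labels it equals the square root of the error rate, that is, sqrt(1 - accuracy); this is a property of the sketch, not a statement about how the reported RMSE was obtained.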
Furthermore, spatial validation, an innovative method used in this study, is based on the percentage of inventoried landslides that each model places in the very high susceptibility class; GBC achieved the highest rate at 99.30%, followed by XGBoost at 97.18%, while LGBM recorded the lowest rate at 88.03%. Additionally, the study incorporated the mean absolute error (MAE) to strengthen the evaluation of the models' robustness, with values of 0.0039 for GBC, 0.0371 for XGBoost, and 0.1610 for LGBM, further confirming GBC as the best-performing model under all three validation techniques. Selecting a high-performing model is essential for producing accurate LSMs and for ensuring reliable predictions for risk assessment and disaster prevention. Integrating multiple validation techniques strengthens the assessment of model robustness and enhances the applicability of the resulting map to resident safety, infrastructure preservation, and effective land-use planning within the Oued Guebli watershed.
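The spatial validation rate and the MAE described above can be sketched in a few lines of Python. The sketch rests on assumptions not stated in the abstract: five susceptibility classes with equal-interval breaks at 0.2, 0.4, 0.6, and 0.8 (the study may have used a different classification scheme), MAE computed between the 0/1 label and the predicted probability, and hypothetical names `classify_susceptibility`, `spatial_validation_rate`, and `mean_absolute_error`. The sample values are invented.

```python
# Assumed class labels and equal-interval breaks; the study's scheme may differ.
CLASS_LABELS = ("very low", "low", "moderate", "high", "very high")
CLASS_BREAKS = (0.2, 0.4, 0.6, 0.8)

def classify_susceptibility(prob, breaks=CLASS_BREAKS):
    """Map a predicted landslide probability to a susceptibility class."""
    for upper, label in zip(breaks, CLASS_LABELS):
        if prob < upper:
            return label
    return CLASS_LABELS[-1]

def spatial_validation_rate(landslide_probs, target="very high"):
    """Percentage of inventoried landslide points falling in the target class."""
    hits = sum(1 for p in landslide_probs if classify_susceptibility(p) == target)
    return 100.0 * hits / len(landslide_probs)

def mean_absolute_error(y_true, probs):
    """Mean absolute difference between 0/1 labels and predicted probabilities."""
    return sum(abs(t - p) for t, p in zip(y_true, probs)) / len(y_true)

# Hypothetical predicted probabilities at five known landslide locations.
landslide_probs = [0.95, 0.85, 0.90, 0.70, 0.99]
print(spatial_validation_rate(landslide_probs))

# Hypothetical labels and probabilities for the MAE.
print(mean_absolute_error([1, 1, 0, 0], [0.9, 0.8, 0.1, 0.2]))

# Arithmetic check against the 284-point inventory mentioned in the abstract.
print([round(100 * k / 284, 2) for k in (282, 276, 250)])
```

As a purely arithmetic observation, 282, 276, and 250 points out of 284 round to 99.30%, 97.18%, and 88.03%, which matches the reported rates; this does not confirm how the authors counted the points.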