Singh Ankit, Dhiman Nitesh, K C Niraj, Shukla Dericks Praise
DExtER Lab, School of Civil and Environmental Engineering, North Campus, IIT Mandi, A-11 Building, Mandi, 175075, Himachal Pradesh, India.
Department of Geomatics Engineering, Tribhuvan University Pashchimanchal Campus, Pokhara, Nepal.
Environ Sci Pollut Res Int. 2024 Sep 2. doi: 10.1007/s11356-024-34726-4.
Developing effective strategies to predict landslide-susceptible areas and reduce risk is vital. This involves using ensemble methods to achieve precise prediction and to address challenges such as data limitation. Recent studies have highlighted the potential of ensemble methods to improve landslide susceptibility maps (LSM). In this context, the ensemble approach samples landslide and non-landslide points from areas of high and low susceptibility, respectively. Extensive research has explored the application of such methods in machine learning, particularly for classification problems. This study examines ensemble strategies as a promising method for future landslide applications. The proposed method was tested in the Kangra district of Himachal Pradesh, for which three datasets of presence and absence points were prepared. Dataset 1 consisted of the initial landslide points and randomly generated non-landslide points. In dataset 2, additional landslide points taken from the very high susceptibility class of the initial LSM supplemented the initial landslide data, while the non-landslide points were generated randomly from the study area. Finally, dataset 3 used the same landslide points as dataset 2, with non-landslide points taken from the very low susceptibility areas of the initial LSM. These datasets were used with random forest (RF) and support vector machine (SVM) classifiers, producing six landslide susceptibility maps. To assess the applicability of the proposed method, we used AUC-ROC, precision, recall, F-score, accuracy, and the Matthews correlation coefficient (MCC). The AUC for dataset 1 was 0.89 with both SVM and RF; it increased to 0.898 and 0.952 for datasets 2 and 3 with SVM, and to 0.937 and 0.954 with RF. Across all combinations, precision and recall were highest for dataset 3, with both SVM and RF.
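The threshold-based metrics listed above can all be derived from the four confusion-matrix counts. The following is a minimal sketch, not the study's evaluation code: the function name `classification_metrics` is hypothetical, the F-score is assumed to be the balanced F1 variant, and the counts in the example are invented for illustration only.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1, accuracy and MCC from confusion counts.

    tp/fn: landslide points predicted as landslide / non-landslide.
    tn/fp: non-landslide points predicted as non-landslide / landslide.
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: "F-score" is taken to mean F1 (harmonic mean of the two).
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient: ranges from -1 to +1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration; these are not results from the study.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four counts in both numerator and denominator, which is one reason it is often reported alongside precision and recall when presence and absence samples are drawn by different rules.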
Hence, based on several accuracy metrics, we conclude that the LSM performed best when landslide and non-landslide samples were drawn from very high and very low susceptibility areas, respectively. Sampling only the landslide points from very high susceptibility areas (dataset 2) did not perform as well, resulting in misclassification errors. The study demonstrated that when landslide and non-landslide data were obtained from very high and very low susceptibility areas, the predictive capability of the LSM increased significantly. Thus, the results demonstrate the effectiveness of the proposed ensemble approach in delineating landslide zones precisely, facilitating informed decision-making for land and hazard management.
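The construction of the three datasets can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function `build_datasets`, the five class labels, the sample sizes, and the exclusion of known landslide cells from the random absence pool are all hypothetical choices made for the example.

```python
import random


def build_datasets(cells, inventory, n_extra, n_absence, seed=0):
    """Build presence/absence samples for datasets 1, 2 and 3.

    cells: dict mapping cell id -> susceptibility class of an initial LSM.
    inventory: set of cell ids containing mapped (initial) landslides.
    Returns {dataset number: (presence ids, absence ids)}.
    """
    rng = random.Random(seed)
    # Assumption: absence and extra points never coincide with mapped landslides.
    free = sorted(c for c in cells if c not in inventory)
    very_high = [c for c in free if cells[c] == "very_high"]
    very_low = [c for c in free if cells[c] == "very_low"]

    initial = sorted(inventory)
    # Extra presence points drawn from the very high class of the initial LSM.
    extra = rng.sample(very_high, n_extra)
    extra_set = set(extra)
    augmented = initial + extra

    # Dataset 1: initial landslides + random absence points.
    absence_1 = rng.sample(free, n_absence)
    # Dataset 2: augmented landslides + random absence points.
    absence_2 = rng.sample([c for c in free if c not in extra_set], n_absence)
    # Dataset 3: augmented landslides + absence points from the very low class.
    absence_3 = rng.sample(very_low, n_absence)

    return {
        1: (initial, absence_1),
        2: (augmented, absence_2),
        3: (augmented, absence_3),
    }


# Toy study area: 100 cells cycling through five hypothetical classes.
classes = ["very_low", "low", "moderate", "high", "very_high"]
cells = {i: classes[i % 5] for i in range(100)}
inventory = {3, 4, 9, 14}  # invented landslide cells, for illustration only
datasets = build_datasets(cells, inventory, n_extra=5, n_absence=9)
for key, (presence, absence) in datasets.items():
    print(key, len(presence), len(absence))
```

Each presence/absence pair would then be used to train a classifier (RF and SVM in the study), giving one susceptibility map per dataset and model combination.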