School of Civil Engineering, Architecture and Environment, Hubei University of Technology, Wuhan, Hubei Province, People's Republic of China.
Innovation Demonstration Base of Ecological Environment Geotechnical and Ecological Restoration of Rivers and Lakes, Hubei University of Technology, Wuhan, Hubei Province, People's Republic of China.
Sci Rep. 2023 Apr 10;13(1):5823. doi: 10.1038/s41598-023-33186-z.
This study uses the Zigui-Badong section of the Three Gorges Reservoir area to investigate the impact of unbalanced sample sets on Landslide Susceptibility Mapping (LSM) and to determine the sample ratio interval at which different models perform best. We employ 12 LSM factors and five training sample sets with different sample ratios (1:1, 1:2, 1:4, 1:8, and 1:16), and use C5.0, Support Vector Machine (SVM), Logistic Regression (LR), and one-dimensional Convolutional Neural Network (CNN) models to obtain the landslide susceptibility index and landslide susceptibility zoning of the study area. Model prediction performance is evaluated by the area under the receiver operating characteristic curve (AUC), five statistical methods, and specific category precision. The results show that the CNN, SVM, and LR models achieve better performance at a sample ratio of 1:2 than on the balanced sample set, which indicates the importance of unbalanced sample sets in training LSM models. The C5.0 model is consistently overfitted in this study and requires further investigation. The conclusions of this study help improve the scientific rigor and reliability of LSM.
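As an illustration of the sampling design and the AUC criterion described above, the following is a minimal sketch rather than the authors' implementation. It assumes that each ratio means one landslide sample to `ratio` non-landslide samples, which the abstract does not state explicitly. The function names `build_sample_set` and `roc_auc` are hypothetical and written for this example only.

```python
import random

# Assumed reading of the ratios 1:1, 1:2, 1:4, 1:8, 1:16:
# number of non-landslide samples drawn per landslide sample.
RATIOS = [1, 2, 4, 8, 16]


def build_sample_set(positives, negatives, ratio, seed=0):
    """Return (samples, labels) holding every positive plus
    len(positives) * ratio negatives drawn without replacement."""
    n_neg = len(positives) * ratio
    if n_neg > len(negatives):
        raise ValueError("not enough negative samples for this ratio")
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    chosen = rng.sample(negatives, n_neg)
    samples = list(positives) + chosen
    labels = [1] * len(positives) + [0] * n_neg
    return samples, labels


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive has the higher
    score; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, one sample set would be built per entry in `RATIOS`, each model trained on it, and `roc_auc` applied to the model's susceptibility scores on a held-out set. The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch but not for large rasters.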