Mosaid Hassan, Barakat Ahmed, John Kingsley, Faouzi Elhousna, Bustillo Vincent, El Garnaoui Mohamed, Heung Brandon
Geomatics, Georesources and Environment Laboratory, Faculty of Sciences and Techniques, Sultan Moulay Slimane University, Béni Mellal, Morocco.
Department of Plant, Food, and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, NS, B2N 5E3, Canada.
Environ Monit Assess. 2024 Jan 10;196(2):130. doi: 10.1007/s10661-024-12294-x.
Soil is a reservoir of organic carbon, and its organic carbon stock is an indicator of soil quality and fertility within terrestrial ecosystems. Understanding the spatial distribution of soil organic carbon stock (SOCS) and the factors that influence it is therefore crucial for sustainable land management and soil health. The present study applied four machine learning (ML) models, namely random forest (RF), k-nearest neighbors (kNN), support vector machine (SVM), and Cubist model tree (Cubist), to improve the prediction of SOCS in the Srou catchment, located in the Upper Oum Er-Rbia watershed, Morocco. From an inventory of 120 sample points, 80% were used to train the models and the remaining 20% were set aside for testing. The Boruta algorithm and a multicollinearity test retained nine controlling factors as input data for predicting SOCS. Spatial distribution maps of SOCS were generated with all four models, compared, and validated using statistical metrics. Among the models tested, RF performed best (R = 0.76, RMSE = 0.52 Mg C/ha, NRMSE = 0.13, and MAE = 0.34 Mg C/ha), followed by SVM (R = 0.68, RMSE = 0.59 Mg C/ha, NRMSE = 0.15, and MAE = 0.34 Mg C/ha) and Cubist (R = 0.64, RMSE = 0.63 Mg C/ha, NRMSE = 0.16, and MAE = 0.43 Mg C/ha), while kNN performed worst (R = 0.31, RMSE = 0.94 Mg C/ha, NRMSE = 0.24, and MAE = 0.63 Mg C/ha). Bulk density, pH, electrical conductivity, and calcium carbonate were the most important factors for spatially predicting SOCS in this semi-arid region. The ML-based methodology used in this study therefore holds potential for modeling and mapping SOCS and other soil properties in comparable contexts elsewhere.
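The 80/20 split and the four validation metrics reported in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The function names, the random seed, and in particular the NRMSE definition (RMSE divided by the range of the observed values) are assumptions made for illustration, since the abstract does not state how NRMSE was normalized.

```python
import math
import random


def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle the indices 0..n-1 and split them into train and test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary illustrative choice
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(ss_o * ss_p)


def rmse(obs, pred):
    """Root mean square error, in the units of the target (here Mg C/ha)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error, in the units of the target."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nrmse(obs, pred):
    """RMSE normalized by the observed range (assumed definition)."""
    return rmse(obs, pred) / (max(obs) - min(obs))


# 120 sample points split 80/20, as described in the abstract.
train_idx, test_idx = split_indices(120)
print(len(train_idx), len(test_idx))
```

With 120 sample points this split yields 96 training and 24 test points. Each metric function takes the observed and predicted SOCS values for the test points and returns a single number that can be compared across models.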