Mundada Shyamal, Jain Pooja
Department of Computer Science and Engineering, Indian Institute of Information Technology, Nagpur, 441108, India.
Sci Rep. 2025 Aug 6;15(1):28760. doi: 10.1038/s41598-025-09804-3.
Soil plays a major role in the agricultural system. Detecting soil composition can help farmers make appropriate decisions that lead to proper crop growth. Soil organic carbon is crucial for many soil activities and ecological characteristics, and is at the centre of sustainable agriculture. The goal of this research is to create a system for evaluating soil organic carbon from topographic features and soil properties using machine learning algorithms. A group of covariates was chosen as potential predictors of soil properties, including four topographical variables, two soil-related remote sensing indices, and four climate variables, which were retrieved from satellite images. Along with these predictor variables, soil health card data were used as the dependent variable to train the model. Bagging and boosting performed considerably better on the training dataset than on the testing dataset. On the training dataset, XGBoost achieved the highest R (0.95) and the lowest RMSE (0.03), with an sMAPE of 0.04, while Random Forest achieved an R of 0.86, an RMSE of 0.06 and an sMAPE of 0.08. On the testing dataset, RMSE ranged between 0.15 and 0.16 and sMAPE between 0.19 and 0.20, while R was 0.12 for Random Forest and 0.03 for XGBoost. This gap between training and testing performance indicates overfitting. The stacking method was notably more effective at reducing overfitting than the other two methods. For stacking, R was low (0.17 on the training dataset and 0.07 on the testing dataset), but RMSE was nearly the same for both datasets, at about 0.16, with an sMAPE of 0.18-0.20. This system will assist farmers in deciding how to apply fertilizer precisely, which will increase crop yield. Applying ML techniques to remote sensing data can help build a decision support system for precision farming that improves crop yield.
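The three metrics reported above (R, RMSE and sMAPE) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that R denotes the Pearson correlation coefficient and that sMAPE follows the common variant mean(2|p - t| / (|t| + |p|)), neither of which the abstract specifies. The function names and sample values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def smape(y_true, y_pred):
    """Symmetric MAPE as a fraction (0 to 2). Assumed variant:
    mean of 2|p - t| / (|t| + |p|); pairs where both values are 0 contribute 0."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        if denom > 0:
            total += 2.0 * abs(p - t) / denom
    return total / len(y_true)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R).
    Undefined (division by zero) if either series is constant."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical soil organic carbon values, for illustration only.
observed = [0.42, 0.55, 0.61, 0.38, 0.70]
predicted = [0.45, 0.50, 0.66, 0.40, 0.64]
print(rmse(observed, predicted), smape(observed, predicted), pearson_r(observed, predicted))
```

Computing the same three functions separately on the training and testing splits makes the kind of gap described above (high training R, low testing R) directly visible.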