Dos Santos Arthur Pereira, da Silva Junior Alessandro Xavier, Nery Liliane Moreira, Gomes Gabriela, Toniolo Bruno Pereira, da Cunha E Silva Darllan Collins, Lourenço Roberto Wagner
Department of Environmental Science, São Paulo State University (UNESP), Sorocaba, São Paulo, Brazil.
Institute of Science and Technology of Sorocaba, São Paulo State University (UNESP), Sorocaba, São Paulo, Brazil.
Environ Monit Assess. 2025 Feb 26;197(3):330. doi: 10.1007/s10661-025-13786-0.
The proportions of sand, silt, and clay define soil texture, which significantly influences agricultural and ecological practices. However, conventional classification methods are costly, which limits the frequency and scope of evaluation. Machine learning algorithms such as random forest offer a more efficient way to predict soil texture accurately. To address this limitation, this study integrates geoprocessing, precision agriculture, and machine learning to classify soil texture in the predominantly agricultural Sorocabuçu River Basin (SRB). Twenty-seven sampling points were selected based on topography and land use to represent the variation across the area and support a reliable classification. Granulometric analysis was performed using the pipette method to separate sand, silt, and clay. The data were spatially interpolated using geographic information system (GIS) techniques. Soil texture was classified using the random forest algorithm, trained on 70% of the data and tested on the remaining 30%, and evaluated by overall accuracy, kappa index, sensitivity, and specificity. The model used fifty trees (ntree) and four features per split (mtry), chosen after considering parameter variability to ensure satisfactory results. The varied spatial distribution of clay, along with high levels of sand and silt, suggests greater vulnerability to erosion in the absence of conservation management practices. The random forest model achieved an out-of-bag (OOB) error of 2.78%, a kappa index of 0.88, and an overall accuracy of 0.92, demonstrating excellent predictive capacity. The variability of sand was essential to the classification, but the Sandy Clay Loam (SCL) class posed challenges because its characteristics are intermediate between sand and clay, which led to classification overlaps. This integrated methodology enhances understanding of soil structure in the SRB and provides a foundation for future research and practical applications, supporting food security and environmental sustainability. The model can be applied in other locations and agricultural contexts. In homogeneous soils, the method's accuracy can be improved by applying machine learning algorithms.
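The evaluation metrics named in the abstract (overall accuracy, kappa index, and per-class sensitivity and specificity) can all be derived from a confusion matrix. The following is a minimal sketch in plain Python, not the authors' implementation. The function names and the short label lists are hypothetical and serve only as illustration; they are not data from the study.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are the true classes, columns the predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def evaluate(matrix: List[List[int]], labels: Sequence[str]) -> Dict:
    """Overall accuracy, Cohen's kappa, and one-vs-rest sensitivity/specificity."""
    k = len(labels)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    # Observed agreement: share of samples on the diagonal.
    accuracy = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    per_class = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = row_tot[i] - tp
        fp = col_tot[i] - tp
        tn = n - tp - fn - fp
        per_class[label] = {
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        }
    return {"accuracy": accuracy, "kappa": kappa, "per_class": per_class}


# Hypothetical test-set labels (C = Clay, SCL = Sandy Clay Loam, SL = Sandy Loam).
labels = ["C", "SCL", "SL"]
y_true = ["SCL", "SCL", "SL", "SL", "C", "C"]
y_pred = ["SCL", "SL", "SL", "SL", "C", "C"]

report = evaluate(confusion_matrix(y_true, y_pred, labels), labels)
print(round(report["accuracy"], 3), round(report["kappa"], 3))
print(report["per_class"]["SCL"])
```

In this invented example one SCL sample is predicted as a neighbouring class, which lowers the sensitivity for SCL while its specificity stays high. That is the kind of pattern a class overlap such as the one described for SCL would produce.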