İnci Yüsra, Bilgili Ali Volkan, Gündoğan Recep, Gözükara Gafur, Karadağ Kerim, Tenekeci Mehmet Emin
Organized Industrial Zone Vocational School, Harran University, Sanliurfa 63300, Türkiye.
Department of Soil Science and Plant Nutrition, Faculty of Agriculture, Harran University, Sanliurfa 63300, Türkiye.
Sensors (Basel). 2024 Aug 7;24(16):5126. doi: 10.3390/s24165126.
In soil science, assigning soil samples to their respective origins is a crucial investigative tool. With the increasing use of proximal sensing and advances in machine-learning techniques, new approaches have emerged that make soil analysis more effective. This study investigates soil classification based on four parent materials. For this purpose, a total of 59 soil samples were collected from 12 profiles and from the vicinity of each profile at a depth of 0-30 cm. Surface soil samples were analyzed for elemental concentrations using X-ray fluorescence (XRF) and inductively coupled plasma-optical emission spectrometry (ICP-OES), and soil spectra were recorded using a visible near-infrared (Vis-NIR) spectrometer. Soil samples collected from the soil profiles (12 samples) and from the surface (47 samples) were used to classify parent materials with machine-learning algorithms including Support Vector Machine (SVM), Ensemble Subspace k-Nearest Neighbor (ESKNN), and Ensemble Bagged Trees (EBTs). The classifiers were validated using five-fold cross-validation and an independent split of the sample set (80% calibration and 20% validation). Accuracy, F score, and G mean were used to evaluate prediction performance. Depending on the dataset and algorithm used, classification success rates ranged from 70% to 100%. Overall, ESKNN (99%) produced better results than the other classification methods. Additionally, Relief algorithms were employed to identify key variables for each dataset (ICP-OES: CaO, Fe₂O₃, Al₂O₃, MgO, and MnO; XRF: SiO₂, CaO, Fe₂O₃, Al₂O₃, and MnO; Vis-NIR: 567, 571, 572, 573, and 574 nm). Reclassifying the soils with these reduced variable sets gave lower accuracies for the Vis-NIR data, with ESKNN still yielding the best results.
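The three evaluation metrics named in the abstract can be illustrated with a minimal sketch for a four-class problem. This is not the authors' implementation. It assumes the F score is macro-averaged F1 and that the G mean is the geometric mean of the per-class recalls, which is one common multi-class definition; the abstract does not state which variants were used. The class labels `A` to `D` and the toy predictions are hypothetical placeholders for the four parent materials.

```python
from math import prod


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def classification_metrics(y_true, y_pred, labels):
    """Accuracy, macro F1, and G mean (assumed: geometric mean of recalls)."""
    cm = confusion_matrix(y_true, y_pred, labels)
    n = len(labels)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    recalls, f1_scores = [], []
    for i in range(n):
        tp = cm[i][i]
        actual = sum(cm[i])                          # samples truly in class i
        predicted = sum(cm[r][i] for r in range(n))  # samples assigned to class i
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    return {
        "accuracy": correct / total,
        "macro_f1": sum(f1_scores) / n,
        "g_mean": prod(recalls) ** (1.0 / n),
    }


# Hypothetical labels standing in for the four parent materials.
labels = ["A", "B", "C", "D"]
y_true = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"]
y_pred = ["A", "A", "B", "B", "B", "B", "C", "C", "C", "D", "D", "D"]

metrics = classification_metrics(y_true, y_pred, labels)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this definition the G mean is the product of the per-class recalls raised to the power 1/n, so it drops to zero if any single class is never recognized. That makes it a stricter companion to accuracy when classes are small or unevenly sized, as they can be with 59 samples spread over four parent materials.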