Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, China; Intelligent Agriculture Engineering Laboratory of Anhui Province, Hefei 230031, China.
Spectrochim Acta A Mol Biomol Spectrosc. 2022 Dec 15;283:121707. doi: 10.1016/j.saa.2022.121707. Epub 2022 Aug 9.
Variable selection is widely accepted as an important step in the quantitative analysis of visible and near-infrared (Vis-NIR) spectroscopy, as it tends to improve a model's robustness and predictive ability. In this study, a total of 140 lime concretion black soil samples were collected from two towns in Guoyang County, China. Vis-NIR spectra measured in the laboratory were used to estimate soil pH with an extreme learning machine (ELM). First, the soil spectra were processed with an optimized continuous wavelet transform (CWT); then four spectral feature selection methods (competitive adaptive reweighted sampling, CARS; successive projections algorithm, SPA; Monte Carlo uninformative variable elimination, MCUVE; genetic algorithm, GA) were combined with ELM in the CWT domain to determine which technique gave the best predictions. For comparison, PLS and SVM models were also developed. The coefficient of determination (R²), root mean square error (RMSE), and residual prediction deviation (RPD) were used to evaluate model performance. Based on the validation dataset, the ELM models outperformed the PLS and SVM models, except for the SPA and MCUVE variants. Among the ELM models, prediction accuracy in descending order was GA-ELM (R² = 0.86; RMSE = 0.1484; RPD = 2.64), CARS-ELM (R² = 0.84; RMSE = 0.1565; RPD = 2.50), ELM (R² = 0.84; RMSE = 0.1572; RPD = 2.49), SPA-ELM (R² = 0.84; RMSE = 0.1589; RPD = 2.47) and MCUVE-ELM (R² = 0.83; RMSE = 0.1599; RPD = 2.45). The proposed CARS-ELM method showed a relatively strong ability for spectral variable selection while retaining excellent prediction accuracy and a short computing time (0.39 s). In addition, the variables selected by the four methods (CARS, SPA, MCUVE and GA) suggested that the prediction mechanism for pH in lime concretion black soil may be the relationship of pH with iron oxides and organic matter.
In conclusion, CARS-ELM has great potential to accurately determine the pH in lime concretion black soil using Vis-NIR spectroscopy.
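For readers unfamiliar with the modelling and evaluation steps named in the abstract, the following is a minimal sketch of a basic ELM regressor and of the three reported metrics. It is not the authors' implementation: the data are synthetic stand-ins for spectral variables, and the function names (`fit_elm`, `predict_elm`, `evaluate`), the tanh activation, the hidden-layer size, the 100/40 calibration/validation split, and the use of the sample standard deviation in RPD are all assumptions made for illustration. No CWT preprocessing or variable selection is performed here.

```python
import numpy as np


def fit_elm(X, y, n_hidden=40, seed=0):
    """Fit a basic ELM: random fixed hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Random input weights and biases; these are never trained (assumed scaling).
    W = rng.normal(scale=1.0 / np.sqrt(n_features), size=(n_features, n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                    # hidden-layer activations
    y_mean = y.mean()                         # centre the target before solving
    beta = np.linalg.pinv(H) @ (y - y_mean)   # output weights via pseudoinverse
    return W, b, beta, y_mean


def predict_elm(model, X):
    """Predict with a model returned by fit_elm."""
    W, b, beta, y_mean = model
    return np.tanh(X @ W + b) @ beta + y_mean


def evaluate(y_true, y_pred):
    """Return (R2, RMSE, RPD) for reference values and predictions."""
    residual = y_true - y_pred
    rmse = np.sqrt(np.mean(residual ** 2))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # RPD taken as SD of the reference values over RMSE (sample SD assumed).
    rpd = np.std(y_true, ddof=1) / rmse
    return r2, rmse, rpd


# Synthetic stand-in data: 140 "samples" x 50 "spectral variables" (hypothetical).
rng = np.random.default_rng(42)
X = rng.normal(size=(140, 50))
y = 7.5 + 0.3 * X[:, 3] - 0.2 * X[:, 17] + rng.normal(scale=0.05, size=140)

# Assumed split: first 100 samples for calibration, last 40 for validation.
X_cal, X_val, y_cal, y_val = X[:100], X[100:], y[:100], y[100:]

model = fit_elm(X_cal, y_cal)
r2, rmse, rpd = evaluate(y_val, predict_elm(model, X_val))
print(f"R2={r2:.2f}  RMSE={rmse:.4f}  RPD={rpd:.2f}")
```

Because the hidden layer is random and only the output weights are solved in closed form, fitting takes a single pseudoinverse. This is consistent with the short computing time the abstract reports for ELM-based models, although the figures printed by this sketch say nothing about the study's results.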