Huang Qing, Zhu Mingdong, Xu Zhenyu, Kan Ruifeng
School of Environmental Science and Optoelectronic Technology, University of Science and Technology of China, Hefei, 230026, Anhui, China; Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China.
School of Environmental Science and Optoelectronic Technology, University of Science and Technology of China, Hefei, 230026, Anhui, China; State Key Laboratory of Hybrid Rice, Hunan Rice Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China.
Anal Chim Acta. 2024 Oct 16;1326:343153. doi: 10.1016/j.aca.2024.343153. Epub 2024 Aug 27.
Wavelength selection is one of the key steps in spectral analysis and plays an irreplaceable role in improving model prediction accuracy and computational efficiency. High-dimensional spectral datasets contain substantial irrelevant information and many redundant variables. Numerous existing wavelength selection methods can address this problem, but they struggle to balance strong wavelength interpretability against prediction accuracy. A new method that achieves this balance is therefore urgently needed.
We propose a new framework for wavelength selection based on wavelength importance clustering (WIC). The framework uses a clustering algorithm to establish a hierarchical relationship between wavelength points and the attributes of the response, and then performs combination and filtering to obtain the optimal wavelength combinations. Based on WIC, we construct a new wavelength selection method (WIC-WRCKF) and compare it with four commonly used wavelength selection methods. Extensive experiments are carried out on three publicly available datasets: wheat, corn, and tablets. Compared with the other methods, WIC-WRCKF achieves the highest prediction accuracy with high stability on all three datasets, and it selects a small number of highly interpretable wavelengths, indicating that WIC-WRCKF has better predictive ability.
The proposed wavelength selection method significantly improves model prediction accuracy, and the WIC architecture effectively captures the essential information in the spectral data, giving it great potential for wavelength selection applications.
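To make the cluster-then-combine idea concrete, the following is a minimal sketch and not the authors' WIC-WRCKF method, whose details the abstract does not give. Everything in it is an assumption made for illustration: the importance score (absolute Pearson correlation with the response), the 1-D k-means used for clustering, the ridge regression model, the single hold-out split, and all function names. It scores each wavelength, groups wavelengths by importance, then tries every combination of groups and keeps the one with the lowest validation error.

```python
import itertools
import numpy as np

def wavelength_importance(X, y):
    """Hypothetical importance score: |Pearson correlation| of each wavelength with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / denom

def kmeans_1d(values, k, n_iter=50):
    """Plain 1-D k-means; initial centers are evenly spaced quantiles."""
    centers = np.quantile(values, np.linspace(0, 1, k))
    for _ in range(n_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels

def holdout_rmse(X, y, cols, train, valid, ridge=1e-3):
    """Validation RMSE of a ridge regression restricted to the wavelengths in cols."""
    A = np.column_stack([X[train][:, cols], np.ones(len(train))])
    w = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y[train])
    B = np.column_stack([X[valid][:, cols], np.ones(len(valid))])
    return float(np.sqrt(np.mean((B @ w - y[valid]) ** 2)))

def select_wavelengths(X, y, k=4, seed=0):
    """Cluster wavelengths by importance, then search combinations of clusters."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    split = int(0.7 * len(y))
    train, valid = idx[:split], idx[split:]
    # Importance is computed on the training part only.
    labels = kmeans_1d(wavelength_importance(X[train], y[train]), k)
    clusters = [np.flatnonzero(labels == j) for j in range(k)]
    clusters = [c for c in clusters if c.size]  # drop empty clusters
    best_cols, best_rmse = None, np.inf
    for r in range(1, len(clusters) + 1):
        for combo in itertools.combinations(clusters, r):
            cols = np.sort(np.concatenate(combo))
            rmse = holdout_rmse(X, y, cols, train, valid)
            if rmse < best_rmse:
                best_cols, best_rmse = cols, rmse
    return best_cols, best_rmse

if __name__ == "__main__":
    # Synthetic "spectra": only wavelengths 5 and 20 carry signal.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(120, 40))
    y = 2.0 * X[:, 5] - 3.0 * X[:, 20] + 0.1 * rng.normal(size=120)
    cols, rmse = select_wavelengths(X, y)
    print("selected wavelength indices:", cols)
    print("validation RMSE:", rmse)
```

With `k` clusters the search visits at most `2**k - 1` combinations, so it stays cheap for small `k`. A real study would likely replace the single hold-out split with cross-validation and use a model suited to collinear spectra, such as partial least squares.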