Fu Guang-Hui, Zong Min-Jie, Wang Feng-Hua, Yi Lun-Zhao
School of Science, Kunming University of Science and Technology, Kunming 650500, China.
Faculty of Agriculture and Food, Kunming University of Science and Technology, Kunming, Yunnan 650500, China.
Int J Anal Chem. 2019 Aug 1;2019:7314916. doi: 10.1155/2019/7314916. eCollection 2019.
Elastic net (Enet) and sparse partial least squares (SPLS) are frequently used for wavelength selection and model calibration in the analysis of near-infrared (NIR) spectroscopy data. Both methods perform variable selection and model calibration simultaneously, and both tend to select wavelength intervals rather than individual wavelengths when the predictors are multicollinear. In this paper, we compare Enet and SPLS for interval wavelength selection and model calibration on NIR spectroscopy data. Results from both simulated and real spectroscopy data show that Enet tends to select fewer predictors as key variables than SPLS; it therefore yields a more parsimonious model, which aids interpretation. SPLS achieves a much lower mean squared prediction error (MSE) than Enet, so SPLS is more suitable when the aim is better model-fitting accuracy. These conclusions still hold for strongly correlated NIR spectroscopy data whose predictors exhibit group structure: Enet is sparser than SPLS, and the selected predictors (wavelengths) form contiguous segments.
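To make the Enet side of the comparison concrete, the following is a minimal sketch of elastic net fitted by coordinate descent on simulated "spectra" in which one contiguous band of wavelengths is strongly collinear. It is not the authors' code and does not implement SPLS. The function names (`soft_threshold`, `elastic_net`, `simulate_band`), the penalty parameterization (`lam` for overall strength, `alpha` for the L1/L2 mix), and all simulation settings are assumptions chosen for illustration only.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for the (assumed) objective
    (1/(2n))*||y - X b||^2 + lam*alpha*||b||_1 + (lam*(1-alpha)/2)*||b||^2.
    X is a list of rows, y a list of responses; no intercept is fitted."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b; b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the partial residual (column j excluded).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def simulate_band(n=40, p=30, band=range(10, 15), seed=0):
    """Hypothetical data: columns in `band` share one latent signal t
    (strongly collinear); all other columns are pure noise; y depends on t."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in band:
            row[j] = t + 0.1 * rng.gauss(0.0, 1.0)
        X.append(row)
        y.append(t + 0.1 * rng.gauss(0.0, 1.0))
    return X, y


if __name__ == "__main__":
    X, y = simulate_band()
    beta = elastic_net(X, y, lam=0.1, alpha=0.5)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("selected wavelength indices:", selected)
```

The L2 part of the penalty is what encourages correlated neighbours to enter the model together, so raising `alpha` toward 1 (a pure lasso penalty) would be expected to pick fewer members of the band, while lowering it spreads weight across the band. The data are not centered or standardized here, which a real calibration would normally do first.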