An Xin, Xu Shuo, Zhang Lu-Da, Su Shi-Guang
College of Science, China Agricultural University, Beijing 100094, China.
Guang Pu Xue Yu Guang Pu Fen Xi. 2009 Jan;29(1):127-30.
Building on the least squares support vector machine (LS-SVM) algorithm, we constructed a multiple-dependent-variable LS-SVM (MLS-SVM) regression model whose weights can be optimized, and we give the corresponding algorithm. We also explain theoretically the relationship between MLS-SVM and LS-SVM. Sixty-four broomcorn samples served as experimental material, split into a modeling set of 51 samples and a prediction set of 13. We first drew five weight groups uniformly at random from the interval [0, 1]; then, using the leave-one-out (LOO) rule, we selected one weight group together with the model parameters, including the penalty and kernel parameters, by minimizing the average relative error. With these settings, a multiple-dependent-variable quantitative analysis model was built from near-infrared (NIR) spectra to determine three chemical constituents simultaneously: protein, lysine and starch. On the prediction set, the average relative errors between actual and predicted values for the three constituents were 1.65%, 6.47% and 1.37%, respectively, and the correlation coefficients were 0.9940, 0.8392 and 0.8825, respectively. For comparison, LS-SVM gave average relative errors of 1.68%, 6.25% and 1.47% and correlation coefficients of 0.9941, 0.8310 and 0.8800, respectively. MLS-SVM is therefore comparable to LS-SVM in modeling performance, and both give satisfactory results. The results show that an MLS-SVM model can carry out multi-component NIR quantitative analysis simultaneously, so MLS-SVM offers chemometrics a new approach to quantitative analysis with multiple dependent variables. In addition, the weights have some effect on the prediction performance of the MLS-SVM model, which agrees with intuition and is confirmed in this study. It is therefore necessary to optimize the weights in NIR modeling with multiple dependent variables.
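
The LS-SVM baseline and the LOO selection step can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard single-output LS-SVM regression system, applied to one constituent at a time, and it assumes a Gaussian (RBF) kernel, since the abstract mentions kernel parameters without naming the kernel. The weighted MLS-SVM formulation is not given in the abstract and is not reproduced here. All function names (rbf_kernel, lssvm_fit, lssvm_predict, mean_relative_error, loo_error, select_parameters) are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the standard LS-SVM regression system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    for the bias b and the coefficients alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Predict with f(x) = sum_i alpha_i * k(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def mean_relative_error(y_true, y_pred):
    """Average of |predicted - actual| / |actual| (assumes nonzero actual values)."""
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

def loo_error(X, y, gamma, sigma):
    """Leave-one-out average relative error for one (gamma, sigma) pair."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # hold out sample i, train on the rest
        b, alpha = lssvm_fit(X[keep], y[keep], gamma, sigma)
        preds[i] = lssvm_predict(X[keep], b, alpha, X[i:i + 1], sigma)[0]
    return mean_relative_error(y, preds)

def select_parameters(X, y, gammas, sigmas):
    """Return the (gamma, sigma) pair with the smallest LOO average relative error."""
    grid = [(g, s) for g in gammas for s in sigmas]
    return min(grid, key=lambda p: loo_error(X, y, p[0], p[1]))
```

Under these assumptions, calling select_parameters once per constituent (protein, lysine, starch) on the 51 modeling samples would mirror the per-component LS-SVM comparison. Selecting a weight group for MLS-SVM would add the candidate weight groups as a further dimension of the same search.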