Research Group Analysis Techniques in the Life Sciences, Avans Hogeschool, University of Professional Education, P.O. Box 90116, 4800 RA Breda, The Netherlands.
Department of Analytical Chemistry and Pharmaceutical Technology, Center for Pharmaceutical Research, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, B-1090 Brussels, Belgium.
Anal Chim Acta. 2017 Aug 22;982:37-47. doi: 10.1016/j.aca.2017.06.001. Epub 2017 Jun 16.
The calibration performance of Partial Least Squares (PLS) regression can be improved by eliminating uninformative variables. Many variable elimination methods have been developed for PLS, one of which is Uninformative-Variable Elimination for PLS (UVE-PLS). However, the number of variables retained by UVE-PLS is usually still large. In UVE-PLS, variable elimination is repeated as long as the root mean squared error of cross validation (RMSECV) decreases, and the variable set at this first local minimum is retained. In this paper, a modification of UVE-PLS is proposed and investigated in which UVE is repeated until no further reduction in variables is possible, followed by a search for the global RMSECV minimum. The method is called Global-Minimum Error Uninformative-Variable Elimination for PLS, denoted GME-UVE-PLS or simply GME-UVE. After each iteration, the predictive ability of the PLS model built with the remaining variable set is assessed by RMSECV, and the variable set with the global RMSECV minimum is finally selected. The goal is to obtain smaller variable sets with predictability similar to or better than that of the sets from the classical UVE-PLS method. The performance of GME-UVE-PLS is investigated using four data sets, i.e. a simulated set, NIR spectra, NMR spectra, and a set of theoretical molecular descriptors, resulting in twelve profile-response (X-y) calibrations. The selective and predictive performances of the models resulting from GME-UVE-PLS are statistically compared to those from UVE-PLS and 1-step UVE using one-sided paired t-tests. The results demonstrate that variable reduction with the proposed GME-UVE-PLS method usually eliminates significantly more variables than the classical UVE-PLS, while the predictive abilities of the resulting models are better. With GME-UVE-PLS, fewer uninformative variables, i.e. variables without a chemical meaning for the response, may be retained than with UVE-PLS. The selectivity of the classical UVE method can thus be improved by applying the proposed GME-UVE method, resulting in more parsimonious models.
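The difference between the two stopping rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the repeated UVE passes have already been run and that each pass recorded the number of remaining variables and the RMSECV. The function names, the tie-breaking rule (prefer fewer variables at equal RMSECV), and the numbers in `trace` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def first_local_minimum(rmsecv: Sequence[float]) -> int:
    """Classical UVE-PLS rule: keep iterating while RMSECV decreases.

    Returns the index of the last pass before RMSECV stops decreasing.
    """
    best = 0
    for i in range(1, len(rmsecv)):
        if rmsecv[i] < rmsecv[i - 1]:
            best = i
        else:
            break  # RMSECV no longer decreases: first local minimum reached
    return best


def global_minimum(rmsecv: Sequence[float], n_vars: Sequence[int]) -> int:
    """GME-UVE rule: search all passes for the lowest RMSECV.

    Assumption: ties are broken in favor of the smaller variable set.
    """
    return min(range(len(rmsecv)), key=lambda i: (rmsecv[i], n_vars[i]))


# Hypothetical trace of repeated UVE passes: (variables retained, RMSECV).
# Index 0 is taken to correspond to a single UVE pass (1-step UVE).
trace: List[Tuple[int, float]] = [
    (310, 0.41),
    (180, 0.39),
    (120, 0.42),
    (95, 0.36),
    (80, 0.40),
]
n_vars = [n for n, _ in trace]
rmsecv = [e for _, e in trace]

uve_choice = trace[first_local_minimum(rmsecv)]
gme_choice = trace[global_minimum(rmsecv, n_vars)]

print("1-step UVE :", trace[0])
print("UVE-PLS    :", uve_choice)
print("GME-UVE-PLS:", gme_choice)
```

In this made-up trace the classical rule stops at the second pass because the RMSECV rises at the third, whereas the global search continues to the end and selects a later pass with both fewer variables and a lower RMSECV.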