Department of Chemistry and Chemical Engineering, Yibin University, Yibin, Sichuan 644007, PR China.
Spectrochim Acta A Mol Biomol Spectrosc. 2010 Dec;77(5):960-4. doi: 10.1016/j.saa.2010.08.031. Epub 2010 Aug 27.
A simple ensemble algorithm, named ESPLS, is proposed for spectral multivariate calibration (MVC). It combines uninformative variable elimination (UVE), bootstrap resampling and mutual information (MI). In ESPLS, uninformative variables are first removed; a preparatory training set is then generated by bootstrap, and an MI spectrum of the retained variables is calculated on it. The variables whose MI exceeds a defined threshold form a subspace on which a candidate partial least-squares (PLS) model is built. This process is repeated. Once a number of candidate models have been obtained, a small subset of them is selected to construct an ensemble model by simple or weighted averaging. Four near/mid-infrared (NIR/MIR) spectral datasets, covering the determination of six components, are used to verify the proposed ESPLS. The results indicate that ESPLS is superior to UVEPLS and to its combination with MI-based variable selection (SPLS) in both accuracy and robustness. Moreover, from the end user's perspective, ESPLS improves calibration performance without increasing the complexity of the calibration.
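The workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the MI estimator, the MI threshold, the number of candidate models or the rule for picking models, so the histogram MI estimate, the quantile threshold (`mi_quantile`), the out-of-bag RMSE ranking and all function and parameter names here are hypothetical choices. The UVE step follows the common noise-variable and leave-one-out formulation, and only simple averaging is shown.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Single-response PLS via NIPALS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                # scores
        tt = t @ t
        p = Xr.T @ t / tt         # X loadings
        c = yr @ t / tt           # y loading
        Xr = Xr - np.outer(t, p)  # deflate X and y
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def uve_mask(X, y, n_comp=3, seed=0):
    """UVE: keep variables more stable than the best artificial noise variable."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Z = np.hstack([X, rng.standard_normal((n, p)) * 1e-10])
    # Leave-one-out PLS coefficients for real and noise variables.
    coefs = np.array([pls1_fit(np.delete(Z, i, 0), np.delete(y, i), n_comp)[0]
                      for i in range(n)])
    stability = coefs.mean(axis=0) / (coefs.std(axis=0) + 1e-300)
    return np.abs(stability[:p]) > np.abs(stability[p:]).max()

def mutual_info(x, y, bins=8):
    """Histogram estimate of MI (in nats); an assumed estimator."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def espls_fit(X, y, keep, n_models=30, n_best=5, n_comp=3,
              mi_quantile=0.5, seed=0):
    """Bootstrap -> MI spectrum -> subspace PLS, repeated; keep the best few."""
    rng = np.random.default_rng(seed)
    keep = np.flatnonzero(keep)
    if keep.size == 0:
        raise ValueError("no variables retained after UVE")
    n = len(y)
    candidates = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)          # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)     # out-of-bag samples
        if oob.size == 0:
            continue
        mi = np.array([mutual_info(X[idx, j], y[idx]) for j in keep])
        cols = keep[mi >= np.quantile(mi, mi_quantile)]  # assumed threshold
        coef, xm, ym = pls1_fit(X[idx][:, cols], y[idx], min(n_comp, len(cols)))
        pred = (X[oob][:, cols] - xm) @ coef + ym
        rmse = float(np.sqrt(np.mean((pred - y[oob]) ** 2)))
        candidates.append((rmse, cols, coef, xm, ym))
    candidates.sort(key=lambda c: c[0])           # assumed selection rule
    return candidates[:n_best]

def espls_predict(models, X):
    """Simple average of the selected candidate models."""
    preds = [(X[:, cols] - xm) @ coef + ym for _, cols, coef, xm, ym in models]
    return np.mean(preds, axis=0)
```

A typical call would be `models = espls_fit(X, y, uve_mask(X, y))` followed by `espls_predict(models, X_new)`. Prediction needs only the stored column indices and coefficient vectors, which is consistent with the abstract's point that the ensemble adds no complexity for the end user.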