Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China.
Anal Chim Acta. 2021 May 22;1160:338453. doi: 10.1016/j.aca.2021.338453. Epub 2021 Mar 28.
Quantitative analysis of the physical or chemical properties of materials using spectral analysis combined with chemometrics has become an important method in analytical chemistry. The approach builds a prediction model relating the feature variables acquired by spectral sensors to the components to be measured. Because the original spectral feature variables contain redundant information and substantial noise, feature selection or transformation is needed to reduce the interference of irrelevant information on the prediction model. Most existing feature selection and transformation methods apply a single linear or nonlinear operation, which can easily lose feature information and degrade the accuracy of the subsequent prediction model. This research proposes M3GPSpectra, a novel strategy for constructing quantitative analysis models oriented toward spectroscopic technology. M3GPSpectra uses a genetic programming algorithm to select and reconstruct the original feature variables, evaluates the selected and reconstructed variables with a multivariate regression model (MLR), and obtains the best feature combination and the final MLR parameters through iterative learning. By integrating feature selection, linear/nonlinear feature transformation, and model construction into a unified framework, it enables end-to-end parameter learning and significantly improves the accuracy of the prediction model. Applied to six types of datasets, M3GPSpectra yields 19 prediction models, which are compared with those obtained by seven popular linear or nonlinear methods. Experimental results show that M3GPSpectra performs best among the eight methods tested. Further investigation verifies that the proposed method is not sensitive to the number of training samples. Hence, M3GPSpectra is a promising tool for spectral quantitative analysis.
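The evaluation step described in the abstract, scoring a set of selected and reconstructed variables with a regression model, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the synthetic five-variable "spectra", the use of ordinary least squares as the regression model, and training-set RMSE as the fitness measure are all assumptions made for illustration. The genetic programming search itself (population, crossover, mutation) is omitted; only the scoring of two hand-written candidate individuals is shown.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no collinear features)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, y):
    """Least-squares fit with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, features):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def fitness(individual, spectra, y):
    """Score one candidate: build its features, fit the regression, return RMSE.

    `individual` is a list of functions, each mapping one spectrum to one
    constructed feature (a hypothetical stand-in for a GP expression tree)."""
    features = [[expr(s) for expr in individual] for s in spectra]
    coefs = fit_mlr(features, y)
    return rmse(y, predict(coefs, features))


# Synthetic data (invented for this example): the target depends on the
# product of variables 0 and 3, so no purely linear selection can fit it exactly.
random.seed(0)
spectra = [[random.uniform(0.1, 1.0) for _ in range(5)] for _ in range(40)]
y = [1.0 + 2.0 * s[0] * s[3] for s in spectra]

# Candidate A: feature selection only (two raw variables).
individual_selected = [lambda s: s[0], lambda s: s[3]]
# Candidate B: one nonlinearly reconstructed feature.
individual_constructed = [lambda s: s[0] * s[3]]

rmse_selected = fitness(individual_selected, spectra, y)
rmse_constructed = fitness(individual_constructed, spectra, y)
print(f"selection only: RMSE = {rmse_selected:.4f}")
print(f"constructed:    RMSE = {rmse_constructed:.2e}")
```

In a full genetic programming loop, a score like this would rank the individuals in each generation, so candidates whose constructed features let the regression fit the target more closely would be more likely to survive and recombine.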