Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection.

Author information

Andrews Hunter B, Sadergaski Luke R, Cary Samantha K

Affiliation

Radioisotope Science and Technology Division, Oak Ridge National Laboratory, 1 Bethel Valley Rd., Oak Ridge, Tennessee 37830, United States.

Publication information

ACS Omega. 2023 Jan 3;8(2):2281-2290. doi: 10.1021/acsomega.2c06610. eCollection 2023 Jan 17.

Abstract

Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized for the quantification of samarium (0-150 μg mL⁻¹), europium (0-75 μg mL⁻¹), and lithium chloride (0.1-12 M) with a transformational preprocessing strategy. Selecting combinations of preprocessing methods to optimize the prediction performance of regression models is frequently a major bottleneck for chemometric analysis. Here, we propose an optimization tool using an innovative combination of optimal experimental designs for selecting preprocessing transformation and a genetic algorithm (GA) for feature selection. A D-optimal design containing 26 samples (i.e., combinations of preprocessing strategies) and a user-defined design (576 samples) did not statistically lower the root mean square error of the prediction (RMSEP). The greatest improvement in prediction performance was achieved when a GA was used for feature selection. This feature selection greatly lowered RMSEP statistics by an average of 53%, resulting in the top models with percent RMSEP values of 0.91, 3.5, and 2.1% for Sm(III), Eu(III), and LiCl, respectively. These results indicate that preprocessing corrections (e.g., scatter, scaling, noise, and baseline) alone cannot realize the optimal regression model; feature selection is a more crucial aspect to consider. This unique approach provides a powerful tool for approaching the true optimum prediction performance and can be applied to numerous fields of spectroscopy and chemometrics to rapidly construct models.
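The idea of GA-driven feature selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the regression is a one-variable least-squares fit on the summed selected channels (a stand-in for partial least squares), and all function names and GA settings (`genetic_select`, `pop_size`, `mutation_rate`, and so on) are hypothetical. The error is computed on a held-out set in the spirit of RMSEP; a real workflow would typically select features by cross-validation and report RMSEP on an independent set.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def make_synthetic(n_samples, n_features, informative, rng):
    """Synthetic 'spectra': only the informative channels respond to concentration."""
    X, y = [], []
    for _ in range(n_samples):
        conc = rng.uniform(0.0, 150.0)
        row = [(0.01 * conc if j in informative else 0.0) + rng.gauss(0.0, 0.5)
               for j in range(n_features)]
        X.append(row)
        y.append(conc)
    return X, y


def holdout_error(mask, X_cal, y_cal, X_val, y_val):
    """Fit on the calibration set using the masked channels; return held-out RMSE.

    Stand-in model (assumption): simple linear regression on the sum of the
    selected channels, used here in place of PLS to stay dependency-free.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return float("inf")  # an empty feature set cannot predict anything
    s_cal = [sum(row[j] for j in cols) for row in X_cal]
    s_val = [sum(row[j] for j in cols) for row in X_val]
    mean_s = sum(s_cal) / len(s_cal)
    mean_y = sum(y_cal) / len(y_cal)
    sxx = sum((s - mean_s) ** 2 for s in s_cal)
    if sxx == 0.0:
        return float("inf")
    slope = sum((s - mean_s) * (y - mean_y) for s, y in zip(s_cal, y_cal)) / sxx
    intercept = mean_y - slope * mean_s
    return rmse(y_val, [intercept + slope * s for s in s_val])


def genetic_select(score, n_features, pop_size=30, generations=40,
                   mutation_rate=0.05, seed=0):
    """Elitist GA over boolean channel masks; lower score is better."""
    rng = random.Random(seed)
    # Seed the population with the all-channels mask so the result can never
    # be worse than using every feature.
    pop = [[True] * n_features]
    pop += [[rng.random() < 0.5 for _ in range(n_features)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[:pop_size // 2]          # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=score)


data_rng = random.Random(42)
N_FEATURES = 20
X_cal, y_cal = make_synthetic(40, N_FEATURES, {3, 4, 5}, data_rng)
X_val, y_val = make_synthetic(20, N_FEATURES, {3, 4, 5}, data_rng)


def score(mask):
    return holdout_error(mask, X_cal, y_cal, X_val, y_val)


full_err = score([True] * N_FEATURES)
best_mask = genetic_select(score, N_FEATURES)
best_err = score(best_mask)
print("all channels:", round(full_err, 2), "| GA-selected:", round(best_err, 2))
print("selected channels:", [j for j, keep in enumerate(best_mask) if keep])
```

Because the better half of each generation is carried over unchanged and the all-channels mask is in the starting population, the selected mask's held-out error cannot exceed that of the full-spectrum model in this sketch; how much it improves depends on the data.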

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/9850777/69087ed55984/ao2c06610_0002.jpg
