Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil.

Authors

Abrantes Giovanna, Almeida Valber, Maia Angelo Jamil, Nascimento Rennan, Nascimento Clistenes, Silva Ygor, Silva Yuri, Veras Germano

Affiliations

Departamento de Química, Centro de Ciência e Tecnologia, Universidade Estadual da Paraíba, Campina Grande 58429-500, Brazil.

Agronomy Department, Federal Rural University of Pernambuco, Recife 52171-900, Brazil.

Publication

Molecules. 2023 Oct 6;28(19):6959. doi: 10.3390/molecules28196959.

Abstract

Soil is one of the Earth's most important natural resources, and metals can degrade environmental quality when present in excessive amounts. Analyzing soil metal contents can be costly and time-consuming, but near-infrared (NIR) spectroscopy coupled with chemometric tools can offer an alternative. Among chemometric tools, partial least-squares (PLS) regression is the most important multivariate calibration method for predicting concentrations or physical, chemical, or physicochemical properties. However, a large number of irrelevant variables may reduce the accuracy of predictive chemometric models. Thus, stochastic variable-selection techniques, such as the Firefly algorithm by intervals in PLS (FFiPLS), can provide better solutions for specific problems. This study aimed to evaluate the performance of FFiPLS against deterministic PLS algorithms for the prediction of metals in river basin soils. Sample spectra were collected in the 1000-2500 nm region. Predictive models were then built from the spectral data using PLS, interval-PLS (iPLS), the successive projections algorithm for interval selection in PLS (iSPA-PLS), and FFiPLS. The models were built with raw data and with data preprocessed by multiplicative scatter correction (MSC), standard normal variate (SNV), mean centering, baseline adjustment, and Savitzky-Golay smoothing. The elliptical joint confidence region (EJCR) computed for each chemometric model indicated an adequate fit. FFiPLS models for iron and titanium obtained a residual prediction deviation (RPD) greater than 2. The models for aluminum obtained an RPD greater than 2 with raw data and with data preprocessed by SNV, MSC, and baseline correction (offset + linear). The metals Be, Gd, and Y failed to obtain adequate models in terms of RPD, a result associated with the low concentrations of these metals in the samples. Considering the complexity of the samples, the relative error of prediction (REP) ranged between 10 and 25%, values adequate for this type of sample. The root mean square errors of calibration and prediction (RMSEC and RMSEP, respectively) followed the same profile as the other quality parameters. The FFiPLS algorithm outperformed the deterministic algorithms in the construction of models estimating the content of Al, Be, Gd, and Y. This study produced chemometric models with variable selection that can determine metals in soils of the Ipojuca River watershed using reflectance-mode NIR spectrometry.
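
The abstract names SNV preprocessing and the figures of merit RMSEP, RPD, and REP but does not state their formulas. The minimal Python sketch below uses definitions that are common in the chemometrics literature and may differ in detail from those used in the study: RPD is assumed to be the sample standard deviation of the reference values divided by RMSEP, and REP is assumed to be RMSEP expressed as a percentage of the mean reference value. The function names and the numbers in the example are hypothetical and are not data from the paper.

```python
import math
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale ONE spectrum
    by its own mean and sample standard deviation."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values
    (RMSEC on the calibration set, RMSEP on the prediction set)."""
    squared_errors = [(r - p) ** 2 for r, p in zip(y_ref, y_pred)]
    return math.sqrt(statistics.fmean(squared_errors))


def rpd(y_ref, y_pred):
    """Assumed definition: SD of reference values divided by RMSEP."""
    return statistics.stdev(y_ref) / rmse(y_ref, y_pred)


def rep_percent(y_ref, y_pred):
    """Assumed definition: RMSEP as a percentage of the mean reference value."""
    return 100.0 * rmse(y_ref, y_pred) / statistics.fmean(y_ref)


# Hypothetical reference and predicted metal contents (illustrative only).
y_ref = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 32.0, 38.0]

print("RMSEP:", rmse(y_ref, y_pred))
print("RPD:", rpd(y_ref, y_pred))
print("REP (%):", rep_percent(y_ref, y_pred))
```

Under these assumed definitions, an RPD above 2 means the prediction error is less than half the spread of the reference values. This is one way to read the abstract's statement that low metal contents, which narrow that spread, made adequate RPD values hard to reach for Be, Gd, and Y.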

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f937/10574190/5fded5f526e4/molecules-28-06959-g001.jpg
