A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

Author information

USDA/ARS, Beltsville Agricultural Research Center, Food Quality Laboratory, Building 303, BARC-East, Beltsville, Maryland 20705-2350, USA. stephen.delwiche@ars.usda.gov

Publication information

Appl Spectrosc. 2010 Jan;64(1):73-82. doi: 10.1366/000370210790572007.

Abstract

In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate (SDS) sedimentation volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
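
For readers unfamiliar with the preprocess functions named in the abstract, the following is a minimal sketch of how the convolution weights of a centrally symmetric, quadratic Savitzky-Golay filter can be computed for smoothing, first-derivative, and second-derivative modes. It is not the study's code: the function names (`savgol_quadratic_coeffs`, `apply_filter`), the synthetic test spectrum, and the choice to drop edge points are assumptions made for illustration. The weights come from the closed-form least-squares fit of a quadratic to an odd-length window; derivatives are expressed per data-point spacing.

```python
def savgol_quadratic_coeffs(window, deriv=0):
    """Weights of a centrally symmetric Savitzky-Golay filter based on a
    quadratic polynomial (hypothetical helper, for illustration only).

    window : odd number of points (the abstract uses 5 to 25)
    deriv  : 0 = smoothing, 1 = first derivative, 2 = second derivative
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    if deriv not in (0, 1, 2):
        raise ValueError("deriv must be 0, 1, or 2")

    half = window // 2
    offsets = range(-half, half + 1)

    # Moments of the symmetric offsets; odd moments vanish, which
    # decouples the normal equations of the quadratic fit.
    s0 = window
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    det = s0 * s4 - s2 * s2

    if deriv == 0:   # fitted value at the window centre
        return [(s4 - s2 * i * i) / det for i in offsets]
    if deriv == 1:   # fitted slope at the window centre
        return [i / s2 for i in offsets]
    # second derivative = 2 * quadratic coefficient of the fit
    return [2 * (s0 * i * i - s2) / det for i in offsets]


def apply_filter(spectrum, coeffs):
    """Slide the weights along the spectrum, dropping the edge points
    where the window would run past the data (an assumption made here;
    other edge treatments are possible)."""
    half = len(coeffs) // 2
    return [
        sum(c * spectrum[k + j] for j, c in zip(range(-half, half + 1), coeffs))
        for k in range(half, len(spectrum) - half)
    ]


# Synthetic, noise-free "spectrum" (a pure quadratic), used only to check
# the filters; it is not data from the study.
spectrum = [2.0 + 0.5 * x + 0.1 * x * x for x in range(40)]

# Sweep the window sizes mentioned in the abstract (5 to 25 points).
results = {}
for window in range(5, 26, 2):
    for deriv in (0, 1, 2):
        coeffs = savgol_quadratic_coeffs(window, deriv)
        results[(window, deriv)] = apply_filter(spectrum, coeffs)
```

Because the test signal is itself a quadratic, every filter in the sweep should reproduce the signal (or its derivatives) at the retained points up to floating-point error; real spectra with noise will respond differently to each window size, which is the kind of sensitivity the abstract warns about. To convert derivatives to physical units, divide by the wavelength spacing raised to the derivative order.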
