Wang Xin, Li Yan, Wei Haoyun, Chen Xia
1 College of Mechanical Engineering and Applied Electronics Technology, Beijing University of Technology, Beijing, China.
2 State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China.
Appl Spectrosc. 2017 Jun;71(6):1231-1241. doi: 10.1177/0003702816675362. Epub 2016 Oct 26.
Classical least squares (CLS) regression is a popular multivariate statistical method, frequently used for quantitative analysis by Fourier transform infrared (FT-IR) spectrometry. CLS provides the best linear unbiased estimator when the residual errors are uncorrelated, with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this zero-mean noise dominates the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. If bias errors, such as the residual baseline error, are significant, however, WLS may perform worse than CLS. In this paper, we compare the effects of noise and bias error on quantitative analysis with CLS and WLS. The results indicate that at wavenumbers with low absorbance, the bias error contributes significantly to the total error, so CLS performs better than WLS. At wavenumbers with high absorbance, the noise contributes significantly to the total error, and WLS performs better than CLS. We therefore propose a selective weighted least squares (SWLS) regression that processes the data at each wavenumber with either CLS or WLS, depending on whether the absorbance is below or above a threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS were studied through numerical simulations. The simulations showed that: (1) the concentration and the analyte type had minimal effect on the OTV; and (2) the major factor influencing the OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper presents a quantitative analysis of methane gas spectra and methane/toluene gas mixture spectra, measured by FT-IR spectrometry and analyzed with CLS, WLS, and SWLS.
The standard error of prediction (SEP), bias of prediction (bias), and residual sum of squares of the errors (RSS) from the three methods were compared. In the methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In the methane/toluene gas mixture analysis, a modified SWLS is presented to address the bias error arising from the other components. The unmodified SWLS gave the lowest SEP in all cases, but not the lowest bias or RSS. The modified SWLS reduced the bias and yielded a lower RSS than CLS, especially for minor components.
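The contrast between the three estimators can be sketched in a few lines of NumPy. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the noise model (absorbance noise growing as 10^A, which follows if the noise is constant in transmittance), the way the SWLS weights are joined at the threshold, and every numerical value in the demo are assumptions made for the example.

```python
import numpy as np

def weighted_fit(K, a, w):
    """Solve min_c sum_i w_i * (a_i - (K c)_i)^2 by scaling rows with sqrt(w)."""
    sw = np.sqrt(w)
    c, *_ = np.linalg.lstsq(K * sw[:, None], a * sw, rcond=None)
    return c

def cls_fit(K, a):
    """CLS: every wavenumber gets the same weight."""
    return weighted_fit(K, a, np.ones_like(a))

def wls_fit(K, a):
    """WLS under the ASSUMED noise model sigma_A ~ 10^A, i.e. weight ~ 10^(-2A)."""
    return weighted_fit(K, a, 10.0 ** (-2.0 * a))

def swls_fit(K, a, threshold):
    """Hypothetical SWLS: unit (CLS) weights below the absorbance threshold,
    noise-based (WLS) weights above it, scaled to equal 1 at the threshold."""
    w = np.where(a < threshold, 1.0, 10.0 ** (-2.0 * (a - threshold)))
    return weighted_fit(K, a, w)

# Synthetic single-component demo; all values are made up for illustration.
rng = np.random.default_rng(0)
nu = np.linspace(0.0, 1.0, 200)                        # arbitrary wavenumber axis
K = np.exp(-0.5 * ((nu - 0.5) / 0.05) ** 2)[:, None]   # unit-concentration spectrum
c_true = np.array([1.5])
a_clean = K @ c_true
sigma = 0.002 * 10.0 ** a_clean                        # heteroscedastic noise level
a_meas = a_clean + 0.003 + rng.normal(0.0, sigma)      # constant baseline bias + noise

for name, c_hat in [("CLS", cls_fit(K, a_meas)),
                    ("WLS", wls_fit(K, a_meas)),
                    ("SWLS", swls_fit(K, a_meas, threshold=0.1))]:
    print(f"{name}: estimated concentration = {c_hat[0]:.4f}")
```

Here `K` holds one unit-concentration reference spectrum per column and `a` is the measured absorbance vector. Setting the threshold above every absorbance value reduces `swls_fit` to CLS, and setting it below every value reduces it to WLS up to a constant scale factor in the weights, which does not change the solution.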