Bangalore A S, Shaffer R E, Small G W, Arnold M A
Department of Chemistry, Clippinger Laboratories, Ohio University, Athens, OH 45701-2979, USA.
Anal Chem. 1996 Dec 1;68(23):4200-12. doi: 10.1021/ac9607121.
Genetic algorithms (GAs) are used to implement an automated wavelength selection procedure for use in building multivariate calibration models based on partial least-squares regression. The method also allows the number of latent variables used in constructing the calibration models to be optimized along with the selection of the wavelengths. The data used to test this methodology are derived from the determination of aqueous organic species by near-infrared spectroscopy. The three data sets employed focus on the determination of (1) methyl isobutyl ketone in water over the range of 1-160 ppm, (2) physiological levels of glucose in a phosphate buffer matrix containing bovine serum albumin and triacetin, and (3) glucose in a human serum matrix. These data sets feature analyte signals near the limit of detection and the presence of significant spectral interferences. Studies are performed to characterize the signal and noise characteristics of the spectral data, and optimal configurations for the GA are found for each data set through experimental design techniques. Despite the complexity of the spectral data, the GA procedure is found to perform well, leading to calibration models that significantly outperform those based on full spectrum analyses. In addition, a significant reduction in the number of spectral points required to build the models is realized.
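The joint search described above, over which wavelengths to keep and how many latent variables to use, can be sketched as follows. The abstract does not specify the chromosome encoding, the genetic operators, or the fitness function, so everything here is an assumption made for illustration: a binary wavelength mask paired with an integer latent-variable count, tournament selection, uniform crossover, bit-flip mutation, two-member elitism, and a fitness equal to the root-mean-square error on a held-out set. The names `pls1_fit`, `pls1_predict`, `rmse`, and `run_ga` are hypothetical, and the small pure-Python PLS1 routine stands in for whatever PLS implementation the authors used.

```python
import random

def pls1_fit(X, y, n_lv):
    """Fit single-response PLS by successive deflation (NIPALS-style)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_lv):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # no covariance left between X and y
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict one sample by replaying the deflation steps of the fit."""
    x_mean, y_mean, comps = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(e, w))
        e = [a - t * b for a, b in zip(e, load)]
        y_hat += q * t
    return y_hat

def rmse(chrom, Xc, yc, Xv, yv):
    """Assumed fitness: held-out RMSE of a PLS model on the masked wavelengths."""
    mask, n_lv = chrom
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty wavelength set cannot be modeled
    sub = lambda X: [[row[j] for j in cols] for row in X]
    model = pls1_fit(sub(Xc), yc, min(n_lv, len(cols)))
    preds = [pls1_predict(model, x) for x in sub(Xv)]
    return (sum((p - v) ** 2 for p, v in zip(preds, yv)) / len(yv)) ** 0.5

def run_ga(Xc, yc, Xv, yv, max_lv=5, pop_size=20, generations=15,
           p_mut=0.05, seed=0):
    """Evolve (wavelength mask, latent-variable count) pairs; lower RMSE wins."""
    rng = random.Random(seed)
    n_wl = len(Xc[0])
    score = lambda c: rmse(c, Xc, yc, Xv, yv)
    pop = [([rng.randint(0, 1) for _ in range(n_wl)], rng.randint(1, max_lv))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)  # best chromosome first
        nxt = pop[:2]        # elitism: carry the two best forward unchanged
        while len(nxt) < pop_size:
            # Tournament of three; pop is sorted, so the lowest index wins.
            a = pop[min(rng.sample(range(pop_size), 3))]
            b = pop[min(rng.sample(range(pop_size), 3))]
            mask = [rng.choice(pair) for pair in zip(a[0], b[0])]  # uniform crossover
            mask = [1 - bit if rng.random() < p_mut else bit for bit in mask]
            n_lv = rng.choice((a[1], b[1]))
            if rng.random() < p_mut:
                n_lv = rng.randint(1, max_lv)  # mutate the latent-variable gene
            nxt.append((mask, n_lv))
        pop = nxt
    best = min(pop, key=score)
    return best, score(best)
```

Calling `run_ga(Xc, yc, Xv, yv)` with calibration spectra `Xc`, concentrations `yc`, and a held-out pair `Xv`, `yv` (all plain lists) returns the best `(mask, n_lv)` pair and its error. Scoring on data not used to fit the model is one way to keep the search from simply rewarding more wavelengths and more latent variables; the population size, mutation rate, and generation count shown are placeholders rather than the configurations the study found through experimental design.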