Stenlund Hans, Johansson Erik, Gottfries Johan, Trygg Johan
Computational Life Science Cluster (CLIC), Chemical Biology Center (KBC), Umeå University, SE-90187 Umeå, Sweden.
Anal Chem. 2009 Jan 1;81(1):203-9. doi: 10.1021/ac801803e.
Near-infrared (NIR) spectroscopy was developed primarily for applications such as the quantitative determination of nutrients in the agricultural and food industries. Examples include the determination of water, protein, and fat within complex samples such as grain and milk. Because of its useful properties, NIR analysis has spread to other areas such as chemistry and pharmaceutical production. NIR spectra consist of infrared overtones and combinations thereof, making interpretation of the results complicated. It can be very difficult to assign peaks to known constituents in the sample. Thus, multivariate analysis (MVA) has been crucial in translating spectral data into information, mainly for predictive purposes. Orthogonal partial least squares (OPLS), a new MVA method, has prediction and modeling properties similar to those of other MVA techniques, e.g., partial least squares (PLS), a method with a long history of use for the analysis of NIR data. OPLS provides an intrinsic algorithmic improvement for the interpretation of NIR data. In this report, four sets of NIR data were analyzed to demonstrate the improved interpretation provided by OPLS. The first two sets included simulated data to demonstrate the overall principles; the third set comprised a statistically replicated design of experiments (DoE), to demonstrate how instrumental differences could be accurately visualized and correctly attributed to Wood's anomaly phenomena; the fourth set was chosen to challenge the MVA by using data relating to powder mixing, a crucial step in the pharmaceutical industry prior to tabletting. Improved interpretation by OPLS was demonstrated for all four examples, as compared to alternative MVA approaches. It is expected that OPLS will be used mostly in applications where improved interpretation is crucial; one such area is process analytical technology (PAT). PAT involves fewer independent samples, i.e., batches, than would be associated with agricultural applications; in addition, the Food and Drug Administration (FDA) demands "process understanding" in PAT. Both of these issues make OPLS the ideal tool for a multitude of NIR calibrations. In conclusion, OPLS leads to better interpretation of spectrometry data (e.g., NIR), and improved understanding facilitates cross-scientific communication. Such improved knowledge will decrease risk, with respect to both accuracy and precision, when using NIR for PAT applications.
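The abstract does not spell out the OPLS algorithm, so the following is only a minimal sketch of the commonly described single-response OPLS filtering steps, not the authors' implementation. The function name `opls_filter`, the simulated two-peak spectra, and all parameter values are assumptions made for illustration. The idea shown is that variation in the spectra that is orthogonal to the response is collected in separate components and removed, which leaves a simpler predictive part to interpret.

```python
import numpy as np


def opls_filter(X, y, n_ortho=1):
    """Remove y-orthogonal variation from X (single-response OPLS sketch).

    Returns the filtered, mean-centered X together with the orthogonal
    scores and loadings that were removed. Assumes X actually contains
    y-orthogonal variation; otherwise the normalisation of w_o would
    divide by (nearly) zero.
    """
    X = X - X.mean(axis=0)          # column-center the spectra
    y = y - y.mean()                # center the response
    w = X.T @ y                     # predictive weight direction
    w /= np.linalg.norm(w)
    T_ortho, P_ortho = [], []
    for _ in range(n_ortho):
        t = X @ w                   # predictive scores
        p = X.T @ t / (t @ t)       # predictive loadings
        w_o = p - (w @ p) * w       # part of p that is orthogonal to w
        w_o /= np.linalg.norm(w_o)
        t_o = X @ w_o               # orthogonal scores (uncorrelated with y)
        p_o = X.T @ t_o / (t_o @ t_o)
        X = X - np.outer(t_o, p_o)  # deflate: strip the orthogonal component
        T_ortho.append(t_o)
        P_ortho.append(p_o)
    return X, np.column_stack(T_ortho), np.column_stack(P_ortho)


# Hypothetical simulated spectra: an analyte peak scaled by y plus an
# overlapping interferent peak scaled by an unrelated quantity z.
rng = np.random.default_rng(0)
n_samples, n_channels = 50, 100
channels = np.arange(n_channels)
analyte_peak = np.exp(-0.5 * ((channels - 40) / 6.0) ** 2)
interferent_peak = np.exp(-0.5 * ((channels - 55) / 10.0) ** 2)
y = rng.normal(size=n_samples)
z = rng.normal(size=n_samples)
X = (np.outer(y, analyte_peak)
     + np.outer(z, interferent_peak)
     + 0.01 * rng.normal(size=(n_samples, n_channels)))

X_filt, T_o, P_o = opls_filter(X, y, n_ortho=1)
```

In this sketch the orthogonal loading in `P_o` can be plotted against the channel index to inspect what kind of variation was removed, which is the sort of interpretation aid the abstract attributes to OPLS.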