Krebbers Roderik, Sluijterman Laurens A Æ, Meurs Joris, Khodabakhsh Amir, Cator Eric A, Cristescu Simona M
Life Science Trace Detection Laboratory, Department of Analytical Chemistry & Chemometrics, Institute for Molecules and Materials, Radboud University, Heyendaalseweg 135, 6525 AJ, Nijmegen, the Netherlands.
Department of Mathematics, Radboud University, Heyendaalseweg 135, 6525 AJ, Nijmegen, the Netherlands.
Anal Chim Acta. 2025 Sep 15;1367:344303. doi: 10.1016/j.aca.2025.344303. Epub 2025 Jun 6.
Broadband mid-infrared spectroscopy for gas sensing is a rapidly progressing field whose main frontier is novel mid-infrared ultra-broadband laser sources. These sources provide increasingly complex, high-resolution spectra that are useful for highly sensitive multispecies (trace) gas detection. However, laser-based sources also add a high level of instrument-specific noise and baseline drift. Handling this data complexity while overcoming the noise and baseline drift requires ways to assess and improve the processing of the acquired spectra.
We present a simulation-based approach that improves the detection of gas compounds from broadband mid-infrared spectra. The central idea is to construct a realistic simulation environment in which the data processing of any chosen model can be improved. This benefits both commonly used processing techniques such as classical least squares (CLS) fitting, which can be fine-tuned, and statistical models such as partial least squares (PLS), which require a relatively large, realistic training set. In addition, an instrument- and application-specific detection limit can be estimated a priori. The simulated spectra are made by combining simulated absorbance spectra with measured (featureless) background intensity spectra, which is crucial for incorporating instrument-specific effects such as baseline drifts. The resulting hybrid dataset is scalable in size and complexity. The workflow was applied to real-life measurements in the 8-11 μm wavelength region to detect trace levels of acetone in CO₂- and water vapor-rich exhaled human breath samples. Both CLS and PLS models could be considerably improved by using the proposed approach.
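The hybrid-spectrum idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything numeric in it is an assumption made for illustration: the Gaussian bands standing in for reference absorbance spectra, the synthetic `fake_background` function standing in for measured featureless background intensity spectra, the noise levels, the polynomial baseline in the CLS fit, and the 3-sigma convention used for the detection-limit estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical wavenumber axis roughly covering 8-11 um (909-1250 cm^-1).
nu = np.linspace(909.0, 1250.0, 500)
x = (nu - nu.mean()) / np.ptp(nu)  # scaled axis for baseline terms


def gaussian_band(center, width):
    """Toy absorption band; a stand-in for a real reference spectrum."""
    return np.exp(-0.5 * ((nu - center) / width) ** 2)


# Assumed unit-concentration absorbance references (napierian, per ppm).
refs = np.column_stack([
    1e-3 * gaussian_band(1217.0, 15.0),  # "acetone-like" band (illustrative)
    5e-4 * gaussian_band(1050.0, 40.0),  # interferent band (illustrative)
])


def fake_background():
    """Stand-in for a measured, featureless background intensity spectrum.

    In the workflow described above this would be real instrument data;
    here it is a smooth envelope with multiplicative noise.
    """
    envelope = 1.0 + 0.3 * x - 0.5 * x**2
    return envelope * (1.0 + 0.001 * rng.standard_normal(nu.size))


def hybrid_spectrum(conc):
    """Impose simulated absorbance on one background draw, then ratio it
    against a second draw to obtain a noisy, drifting absorbance spectrum."""
    absorbance = refs @ conc
    drift = 1.0 + 0.05 * rng.standard_normal()  # assumed power drift
    sample = fake_background() * drift * np.exp(-absorbance)
    reference = fake_background()
    return -np.log(sample / reference)


def cls_fit(measured, poly_order=2):
    """CLS fit of the references plus a polynomial baseline (a tunable choice)."""
    baseline = np.vander(x, poly_order + 1)
    design = np.hstack([refs, baseline])
    coef, *_ = np.linalg.lstsq(design, measured, rcond=None)
    return coef[: refs.shape[1]]


# Evaluate the fit on many simulated spectra with known concentrations (ppm).
true_conc = np.array([2.0, 10.0])
estimates = np.array([cls_fit(hybrid_spectrum(true_conc)) for _ in range(200)])
bias = estimates.mean(axis=0) - true_conc
spread = estimates.std(axis=0, ddof=1)

# Rough a priori detection limit for the first species (3-sigma convention).
lod = 3.0 * spread[0]
print("bias:", bias, "spread:", spread, "LOD estimate (ppm):", lod)
```

Because the true concentrations are known, the same loop can be rerun with a different `poly_order`, fit window, or model (for example a PLS regression trained on the simulated set) to compare bias and spread before any real sample is measured.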
The workflow presented here provides a means to optimize, train, and assess data-processing techniques for broadband mid-infrared gas spectra. The approach is widely applicable: it can be implemented for any gas absorption spectroscopic instrument with broadband coverage, as long as it is possible to determine the instrument-specific characteristics, and it can improve and evaluate a wide variety of data-processing techniques.