Department of Chemistry, Box 351700, University of Washington, Seattle, WA 98195, USA.
J Chromatogr A. 2024 Sep 13;1732:465220. doi: 10.1016/j.chroma.2024.465220. Epub 2024 Jul 31.
Partial least squares (PLS) regression is a valuable chemometric tool for property prediction when coupled with gas chromatography (GC). Since the separation run time and stationary phase selection are crucial for effective PLS modeling, we study the effect of these GC parameters on the prediction of viscosity, density, and hydrogen content for 50 aerospace fuels. Due to the diversity of compounds in the fuels (primarily alkanes, cycloalkanes, and aromatics), we explore both polar and non-polar stationary phase columns. The robustness of the PLS models was evaluated by their normalized root mean square error of cross-validation (NRMSECV). PLS models built for viscosity across 1-min, 3-min, 7-min, and 10-min time window (TW) high-speed GC separations produced nearly the same NRMSECV on both columns: the polar column data gave an average (standard deviation) of 4.41 % (0.34 %), versus 4.69 % (0.15 %) for the non-polar column data. In contrast, for density, the NRMSECV with the polar column data varied more than for the viscosity models, averaging 7.54 % (0.67 %), while the non-polar column data produced a significantly higher average NRMSECV of 10.06 % (0.35 %). Similarly, for hydrogen content, the NRMSECV with the polar column data averaged 9.50 % (0.87 %), which was significantly lower than the 12.10 % (0.88 %) average obtained with the non-polar column data. We also investigated the impact of smoothing the GC data on the corresponding PLS models. By applying varying degrees of smoothing to a longer separation, we can effectively emulate the chromatographic peak patterns of a shorter TW. For example, a smoothed 10-min chromatogram resembles the unsmoothed 1-min separation, yet resulted in nearly the same NRMSECV. Overall, the fast separation with a 1-min TW produced robust PLS models for viscosity with either stationary phase column, whereas for density and hydrogen content the polar stationary phase column produced superior PLS models. Thus, with proper stationary phase selection, a fast separation run time can be readily applied while still obtaining optimal PLS property modeling results.
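The NRMSECV figures and the "average (standard deviation)" summaries quoted above can be illustrated with a minimal sketch. The abstract does not state how the RMSECV was normalized, so normalizing by the mean of the measured property values is an assumption here. The function names `nrmsecv` and `mean_and_sd` are hypothetical, and the inputs are taken to be measured values paired with predictions for held-out samples from some cross-validated PLS model.

```python
import math


def nrmsecv(y_true, y_cv_pred):
    """Return the normalized RMSECV in percent.

    Assumption: the RMSECV is divided by the mean of the measured values.
    The abstract does not say which normalization the authors used.
    """
    if len(y_true) != len(y_cv_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Root mean square error over the cross-validated (held-out) predictions.
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_cv_pred))
    rmsecv = math.sqrt(squared_error / n)
    return 100.0 * rmsecv / (sum(y_true) / n)


def mean_and_sd(values):
    """Average and sample standard deviation, e.g. of NRMSECV across time windows."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for a standard deviation")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, sd
```

Under these assumptions, one would compute `nrmsecv` once per time window (1, 3, 7, and 10 min) and pass the four results to `mean_and_sd` to obtain a summary in the same "average (standard deviation)" form used in the abstract.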
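The smoothing step described in the abstract can be sketched as follows. The abstract does not name the smoothing method, so the centered moving average (boxcar) filter used here is an assumption chosen for simplicity, and `boxcar_smooth` is a hypothetical helper name. Wider windows broaden and merge neighboring peaks, which is one way a long chromatogram could come to resemble a faster, lower-resolution separation.

```python
def boxcar_smooth(signal, width):
    """Smooth a chromatogram with a centered moving average.

    Assumption: a boxcar filter stands in for whatever smoothing the
    authors applied. The window is truncated at the edges of the signal.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        # Clip the window so it never reaches past either end of the signal.
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        window = signal[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

Increasing `width` corresponds to a heavier degree of smoothing. The smoothed signal has the same length as the input, so it can be fed into the same modeling workflow as the unsmoothed data.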