Department of Chemistry, Box 351700, University of Washington, Seattle, WA, 98195, USA.
Anal Chim Acta. 2022 May 29;1209:339847. doi: 10.1016/j.aca.2022.339847. Epub 2022 Apr 19.
Tile-based variance rank initiated-unsupervised sample indexing (VRI-USI) analysis is introduced for comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS). VRI-USI analysis addresses the problem that irrelevant variables often obscure true chemical variation when other unsupervised chemometric tools are used. For GC×GC-TOFMS data, VRI-USI analysis is implemented on the tile-based Fisher ratio (F-ratio) analysis software platform, which mitigates the effects of retention shifting in both separation dimensions; as the initial step, an unsupervised variance metric (instead of the F-ratio metric) is used to rank the hitlist. Next, k-means clustering with k clusters is applied to each hit and evaluated with the silhouette metric, S, to reveal the extent to which recurring indexed sample clusters appear. Finally, an unsupervised class membership is revealed from a probability-based evaluation of how the individual samples cluster throughout the hitlist. For a JP8 jet fuel dataset comprising samples spiked with a sulfur-containing analyte mix at 30-ppm and 15-ppm, plus neat (unspiked) fuel, clustering by spike level at k = 3 was the most frequently recurring set of index assignments, occurring for 11 of the 14 spiked analytes. When these k-means index assignments were applied to the entire hitlist, all 14 spiked hits had one-way ANOVA p-values < 0.05, validating the presumed classes. Next, VRI-USI applied to a comparison of 3-ppm spiked and neat JP8 jet fuel performed similarly to F-ratio analysis for analyte discovery. In the last study, for a dataset of J1800A, JP4, and JP8 jet fuels, each analyzed both spiked with the sulfur-containing analyte mix at 30-ppm and neat, 453 of the 520 hits in the hitlist exhibited index assignments indicative of clustering by fuel type, while the remaining 67 hits had contradictory assignments. Closer scrutiny of these 67 hits revealed nine hits with "split combinations" in index assignments, whereby the spiked and neat samples for a given fuel fell into separate clusters. Eight of these nine hits were identified as spiked sulfur analytes. Interestingly, these hits also had large S, indicative of a true sub-cluster. Thus, tile-based VRI-USI analysis appears to be a promising tool for unsupervised multi-class classification studies using GC×GC-TOFMS data.
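The workflow summarized above (rank hits by variance, cluster each hit with k-means, score with S, then look for recurring index assignments) can be illustrated with a minimal Python sketch. This is not the authors' software: the plain population variance used for ranking, the choice of k by maximum silhouette, the deterministic centroid initialization, the function names, and the made-up signal values in DEMO_HITS are all assumptions for illustration. The tile-based retention-shift handling and the probability-based membership evaluation are not reproduced.

```python
from collections import Counter
from statistics import mean, pvariance


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars with deterministic, evenly spaced starting centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return labels


def silhouette(values, labels):
    """Mean silhouette width S; returns -1.0 when fewer than two clusters are occupied."""
    clusters = set(labels)
    if len(clusters) < 2:
        return -1.0
    widths = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        own = [abs(v - w) for j, (w, m) in enumerate(zip(values, labels))
               if m == lab and j != i]
        if not own:  # singleton cluster: width is 0 by convention
            widths.append(0.0)
            continue
        a = mean(own)
        b = min(mean([abs(v - w) for w, m in zip(values, labels) if m == c])
                for c in clusters if c != lab)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(widths)


def canonical(labels):
    """Relabel clusters by order of first appearance so equal partitions compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)


def index_samples(hits, k_values=(2, 3, 4), top_n=None):
    """Rank hits by variance, cluster each one, and tally recurring index assignments."""
    # Assumption: plain population variance stands in for the paper's variance metric.
    ranked = sorted(hits, key=lambda name: pvariance(hits[name]), reverse=True)[:top_n]
    per_hit = {}
    for name in ranked:
        candidates = [kmeans_1d(hits[name], k) for k in k_values]
        # Assumption: keep the k whose clustering gives the largest S.
        best = max(candidates, key=lambda lab: silhouette(hits[name], lab))
        per_hit[name] = (canonical(best), silhouette(hits[name], best))
    tally = Counter(assignment for assignment, _ in per_hit.values())
    return per_hit, tally.most_common(1)[0]


# Made-up signals for 9 samples: 3 neat, 3 at a mid spike level, 3 at a high spike level.
DEMO_HITS = {
    "matrix_C": [2.0, 2.1, 1.9, 2.05, 1.95, 2.0, 2.1, 1.9, 2.0],
    "spiked_A": [0.1, 0.2, 0.15, 5.0, 5.1, 4.9, 10.0, 10.2, 9.9],
    "spiked_B": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9, 6.0, 6.1, 5.9],
}
per_hit, (consensus, count) = index_samples(DEMO_HITS, top_n=2)
```

In this toy example the two highest-variance hits are the hypothetical spiked analytes, and both yield the same three-cluster index assignment, which mirrors the idea of a recurring assignment across the hitlist. A real analysis would start from tile-based hit signals rather than hand-written values.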