Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States.
C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States.
J Am Soc Mass Spectrom. 2023 Jul 5;34(7):1248-1262. doi: 10.1021/jasms.3c00089. Epub 2023 May 31.
This study aims to resolve one of the longest-standing problems in mass spectrometry: how to accurately identify an organic substance from its mass spectrum when a spectrum of the suspected substance has not been acquired contemporaneously on the same instrument. Part one of this two-part report describes how Rice-Ramsperger-Kassel-Marcus (RRKM) theory predicts that many branching ratios in replicate electron-ionization mass spectra will be approximately linearly correlated when analysis conditions change within or between instruments. Here, proof-of-concept general linear modeling is based on the 20 most abundant fragments in a database of 128 training spectra of cocaine collected over 6 months in an operational crime laboratory. The statistical validity of the approach is confirmed through both analysis of variance (ANOVA) of the regression models and assessment of the distributions of the residuals of the models. The general linear models typically explain more than 90% of the variance in normalized abundances. When the linear models from the training set were applied to 175 additional known-positive cocaine spectra from more than 20 different laboratories, they predicted ion abundances with an accuracy of <2% relative to the base peak, even though the measured abundances varied by more than 30%. The same models were also applied to 716 known-negative spectra, including the diastereomers of cocaine (allococaine, pseudococaine, and pseudoallococaine), and the residual errors were larger for the known negatives than for the known positives. The second part of the manuscript describes how general linear regression modeling can serve as the basis for binary classification and reliable discrimination of cocaine from its diastereomers and all other known negatives.
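The idea of predicting each ion's abundance from the others can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the function names (`simulate_spectra`, `fit_ion_models`, `predict_ions`, `r_squared`) are hypothetical, and the simulation assumes that each ion's abundance shifts linearly with a single latent instrument condition, with six ions used instead of 20 for brevity. Only the sample sizes (128 training spectra, 175 additional spectra) are taken from the abstract.

```python
import numpy as np


def simulate_spectra(rng, n_spectra, n_ions=6, noise=0.3):
    """Synthetic abundances (% of base peak); an assumption, not real cocaine data.

    Each ion drifts linearly with one latent condition t, plus random noise.
    """
    base = np.linspace(20.0, 80.0, n_ions)      # mean abundance of each ion
    slope = np.linspace(-15.0, 15.0, n_ions)    # sensitivity of each ion to t
    t = rng.uniform(-1.0, 1.0, size=(n_spectra, 1))
    return base + t * slope + rng.normal(0.0, noise, size=(n_spectra, n_ions))


def _design(spectra, j):
    """Intercept column plus the abundances of every ion except ion j."""
    others = np.delete(spectra, j, axis=1)
    return np.column_stack([np.ones(spectra.shape[0]), others])


def fit_ion_models(train):
    """Fit one least-squares linear model per ion, using the other ions as predictors."""
    models = []
    for j in range(train.shape[1]):
        coef, *_ = np.linalg.lstsq(_design(train, j), train[:, j], rcond=None)
        models.append(coef)
    return models


def predict_ions(models, spectra):
    """Predict every ion's abundance in each spectrum from that spectrum's other ions."""
    pred = np.empty(spectra.shape, dtype=float)
    for j, coef in enumerate(models):
        pred[:, j] = _design(spectra, j) @ coef
    return pred


def r_squared(observed, predicted):
    """Fraction of variance explained, computed separately for each ion."""
    ss_res = ((observed - predicted) ** 2).sum(axis=0)
    ss_tot = ((observed - observed.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
train = simulate_spectra(rng, 128)       # stand-in for the training spectra
new = simulate_spectra(rng, 175)         # stand-in for spectra from other laboratories

models = fit_ion_models(train)
train_r2 = r_squared(train, predict_ions(models, train))
residuals = new - predict_ions(models, new)

print("R^2 per ion (training):", np.round(train_r2, 3))
print("measured range per ion:", np.round(np.ptp(new, axis=0), 1))
print("mean |residual| per ion:", np.round(np.abs(residuals).mean(axis=0), 2))
```

In this toy setting the measured abundances spread widely because of the latent condition, while the residuals stay small because the ions covary. A spectrum from a different compound would not be expected to follow the same covariance pattern, so its residuals would tend to be larger, which is the property the abstract describes for the known negatives.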