Wu Lina, Xiao Fu, Luo Xiaomin, Yun Keming, Wen Di, Lin Jiaman, Yang Shuo, Li Tianle, Xiang Ping, Shi Yan
Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China.
Shanxi Medical University, Jinzhong 030600, PR China.
Heliyon. 2023 May 25;9(6):e16671. doi: 10.1016/j.heliyon.2023.e16671. eCollection 2023 Jun.
Abuse of synthetic cannabinoids (SCs) has become a serious threat to public health. Because illicit manufacturers keep modifying their structures and chemical groups, detecting SCs is a major challenge in forensic toxicological identification. Rapid and efficient identification of SCs is therefore important for forensic toxicology and drug control. The predicted retention time of an analyte in liquid chromatography is an important index for the qualitative analysis of compounds and can provide informatics solutions for the interpretation of chromatographic data.
In this study, experimental data from high-resolution mass spectrometry (HRMS) were used to construct a regression model that predicts the retention time of SCs using machine-learning methods. The predictive ability of the model was improved by a strategy that pairs different combinations of descriptors with several independent machine-learning methods.
The best model combined Substructure Fingerprint Count and Fingerprinter features with support vector regression (SVR), yielding an R value of 0.81 for the validation set and 0.83 for the test set. In addition, the optimized model predicted the retention times of 4 new SCs with a prediction error within 3%.
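The two figures quoted above can be illustrated with a minimal sketch. It assumes that the reported R is a Pearson correlation coefficient between observed and predicted retention times and that "prediction error" means relative error in percent; the abstract does not state either definition. The function names and the retention times are hypothetical and are not data from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted retention times.

    Assumption: the abstract's "R value" is this coefficient.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def percent_error(observed, predicted):
    """Relative prediction error in percent (assumed definition)."""
    return abs(predicted - observed) / observed * 100.0


# Made-up retention times in minutes, for illustration only.
observed_rt = [6.0, 8.0, 10.0, 12.0]
predicted_rt = [6.1, 7.9, 10.2, 11.8]

r_value = pearson_r(observed_rt, predicted_rt)
errors = [percent_error(o, p) for o, p in zip(observed_rt, predicted_rt)]
print(f"R = {r_value:.3f}")
print("all errors within 3%:", all(e <= 3.0 for e in errors))
```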
Our study provides a model that predicts the retention time of compounds. Used in combination with LC-HRMS, it can serve as a filter to reduce false-positive candidates, especially when reference standards are unavailable. This can improve the confidence of identification in non-targeted analysis and the reliability of identifying unknown substances.
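The filtering idea can be sketched as follows. This is an illustrative outline only, not the authors' workflow: the `Candidate` class, the `filter_by_retention_time` function, the candidate names and the 3% default tolerance (borrowed from the prediction error reported above) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_rt: float  # model-predicted retention time, in minutes


def filter_by_retention_time(candidates, observed_rt, tolerance_pct=3.0):
    """Keep candidates whose predicted retention time lies within
    tolerance_pct percent of the observed retention time."""
    kept = []
    for candidate in candidates:
        deviation_pct = abs(candidate.predicted_rt - observed_rt) / observed_rt * 100.0
        if deviation_pct <= tolerance_pct:
            kept.append(candidate)
    return kept


# Hypothetical candidates that share the same accurate mass.
candidates = [
    Candidate("candidate_A", 9.8),
    Candidate("candidate_B", 12.5),
    Candidate("candidate_C", 10.2),
]
plausible = filter_by_retention_time(candidates, observed_rt=10.0)
print([c.name for c in plausible])
```

A tighter or looser tolerance trades false positives against the risk of discarding the true compound, so the threshold would need to be set from the model's validated error on the chromatographic system in use.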