Mollerup Christian Brinch, Mardal Marie, Dalsgaard Petur Weihe, Linnet Kristian, Barron Leon Patrick
Section of Forensic Chemistry, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Frederik V's vej 11, 3rd floor, Copenhagen DK-2100, Denmark.
J Chromatogr A. 2018 Mar 23;1542:82-88. doi: 10.1016/j.chroma.2018.02.025. Epub 2018 Feb 15.
Exact mass, retention time (RT), and collision cross section (CCS) are used as identification parameters in liquid chromatography coupled to ion mobility high-resolution accurate mass spectrometry (LC-IM-HRMS). Targeted screening analyses are now more flexible and can be expanded to suspect and non-targeted screening. These approaches allow tentative identification of new compounds, and in-silico predicted reference values are used to improve confidence and filter out false-positive identifications. In this work, both RT and CCS values were predicted by machine learning using artificial neural networks (ANNs). Predictions were based on molecular descriptors, 827 RTs, and 357 CCS values from pharmaceuticals, drugs of abuse, and their metabolites. ANN models predicting RT or CCS separately were examined, and the potential to predict both from a single model was investigated for the first time. The optimized combined RT-CCS model was a four-layered multi-layer perceptron ANN, and the 95th percentiles of the prediction error were within 2 min RT error and 5% relative CCS error for the external validation set (n = 36) and the full RT-CCS dataset (n = 357). For the full RT dataset, 88.6% (n = 733) of predicted RTs were within 2 min error. Overall, when both the 2 min RT error and the 5% relative CCS error thresholds were applied, 91.9% (n = 328) of compounds were retained, while 99.4% (n = 355) were retained when at least one of these thresholds had to be met. This combined prediction approach can therefore be useful for rapid suspect/non-targeted screening involving HRMS, and will support current workflows.
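The two retention criteria described in the abstract (2 min RT error and 5% relative CCS error, applied either jointly or as "at least one") can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Candidate` class, the function names, and the example values are hypothetical, and the relative CCS error is assumed here to be taken relative to the measured value, which the abstract does not specify.

```python
from dataclasses import dataclass

RT_TOLERANCE_MIN = 2.0    # absolute RT error threshold stated in the abstract (minutes)
CCS_TOLERANCE_PCT = 5.0   # relative CCS error threshold stated in the abstract (percent)


@dataclass
class Candidate:
    """Hypothetical container for one tentatively identified compound."""
    name: str
    rt_measured: float    # minutes
    rt_predicted: float   # minutes
    ccs_measured: float   # square angstroms
    ccs_predicted: float  # square angstroms


def rt_within(c: Candidate) -> bool:
    """True if the absolute RT prediction error is within the threshold."""
    return abs(c.rt_predicted - c.rt_measured) <= RT_TOLERANCE_MIN


def ccs_within(c: Candidate) -> bool:
    """True if the relative CCS error is within the threshold.

    Assumption: the error is expressed relative to the measured CCS.
    """
    rel_error_pct = abs(c.ccs_predicted - c.ccs_measured) / c.ccs_measured * 100.0
    return rel_error_pct <= CCS_TOLERANCE_PCT


def filter_candidates(candidates, mode="both"):
    """Retain candidates meeting both thresholds ("both") or at least one ("either")."""
    if mode == "both":
        return [c for c in candidates if rt_within(c) and ccs_within(c)]
    if mode == "either":
        return [c for c in candidates if rt_within(c) or ccs_within(c)]
    raise ValueError("mode must be 'both' or 'either'")


# Made-up example values, for illustration only (not data from the study).
candidates = [
    Candidate("A", rt_measured=5.0, rt_predicted=5.5, ccs_measured=150.0, ccs_predicted=152.0),
    Candidate("B", rt_measured=5.0, rt_predicted=8.0, ccs_measured=200.0, ccs_predicted=204.0),
    Candidate("C", rt_measured=3.0, rt_predicted=6.5, ccs_measured=180.0, ccs_predicted=200.0),
]

retained_both = [c.name for c in filter_candidates(candidates, "both")]
retained_either = [c.name for c in filter_candidates(candidates, "either")]
print(retained_both, retained_either)
```

In this sketch the stricter "both" mode corresponds to the setting under which the abstract reports 91.9% of compounds retained, and the "either" mode to the setting under which 99.4% were retained.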