Big Data Research Centre, Université Laval, Québec City G1V 0A6, Canada.
Centre de Recherche en Infectiologie de l'Université Laval, Axe Maladies Infectieuses et Immunitaires, Centre de Recherche du CHU de Québec-Université Laval, Québec City G1V 4G2, Canada.
Anal Chem. 2019 Apr 16;91(8):5191-5199. doi: 10.1021/acs.analchem.8b05821. Epub 2019 Apr 1.
Untargeted metabolomic measurements using mass spectrometry are a powerful tool for uncovering new small molecules of environmental and biological importance. The small-molecule identification step, however, remains an enormous challenge because of fragmentation difficulties or nonspecific fragment ion information. Current methods to address this challenge often depend on databases or require nuclear magnetic resonance (NMR), each of which has its own difficulties. The use of gas-phase collision cross section (CCS) values obtained from ion mobility spectrometry (IMS) measurements was recently demonstrated to reduce the number of false-positive metabolite identifications. Although this approach is promising, the amount of empirical CCS information currently available is limited, so predictive CCS methods need to be developed. In this article, we expand upon current experimental IMS capabilities by predicting CCS values using a deep learning algorithm. We successfully developed and trained a prediction model for CCS values that requires only a compound's SMILES notation and ion type. Data from five different laboratories using different instruments allowed the algorithm to be trained and tested on more than 2400 molecules. The resulting CCS predictions achieved a coefficient of determination of 0.97 and a median relative error of 2.7% for a wide range of molecules. Furthermore, the method requires only a small amount of processing power to predict CCS values. Considering its performance, the time and resources it requires, and its applicability to a variety of molecules, this model was able to outperform all currently available CCS prediction algorithms.
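The two reported metrics, the coefficient of determination and the median relative error, can be illustrated with a minimal sketch. It assumes the standard definitions (R² = 1 - SS_res/SS_tot, and relative error = |predicted - measured| / measured); the abstract does not state the exact formulas the authors used. The function names and the CCS values below are hypothetical and chosen only for illustration; they are not data from the study.

```python
from statistics import mean, median


def r_squared(measured, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    avg = mean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - avg) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def median_relative_error(measured, predicted):
    """Median of |predicted - measured| / measured, returned as a percentage."""
    errors = [abs(p - m) / m for m, p in zip(measured, predicted)]
    return 100.0 * median(errors)


# Made-up CCS values (in square angstroms) for illustration only.
measured = [150.0, 200.0, 250.0, 300.0]
predicted = [153.0, 196.0, 255.0, 294.0]

r2 = r_squared(measured, predicted)
mre = median_relative_error(measured, predicted)
print(f"R^2 = {r2:.4f}, median relative error = {mre:.2f}%")
```

Using the median rather than the mean of the relative errors makes the summary less sensitive to a few poorly predicted molecules, which may be why it is reported alongside R².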