Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China.
University of Chinese Academy of Sciences, Beijing 100049, China.
Anal Chem. 2023 Sep 19;95(37):13913-13921. doi: 10.1021/acs.analchem.3c02267. Epub 2023 Sep 4.
The development of ion mobility-mass spectrometry (IM-MS) has revolutionized small-molecule analysis in fields such as metabolomics, lipidomics, and exposomics. The curation of comprehensive reference collision cross-section (CCS) databases plays a pivotal role in the successful application of IM-MS to small-molecule analysis. In this study, we present AllCCS2, an enhanced version of AllCCS designed for the universal prediction of ion mobility CCS values of small molecules. AllCCS2 incorporates newly available experimental CCS data, comprising 10,384 records and 7,713 unified values, as training data. Its neural network is trained on diverse molecular representations: mass spectrometry features, molecular descriptors, and graph features extracted using a graph convolutional network. AllCCS2 achieved median relative error (MedRE) values of 0.31%, 0.72%, and 1.64% on the training, validation, and testing sets, respectively, surpassing existing CCS prediction tools in both accuracy and coverage. Furthermore, AllCCS2 was compatible with different instrument platforms (DTIMS, TWIMS, and TIMS). The prediction uncertainties arising from the training data and from the prediction model were investigated using representative structure similarity and model prediction variation, respectively. Notably, small molecules with high structural similarity to the training set and low model prediction variation were predicted more accurately, with lower relative errors. In summary, AllCCS2 serves as a valuable resource to support applications of IM-MS technologies. The AllCCS2 database and tools are freely accessible at http://allccs.zhulab.cn/.
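The MedRE metric reported above can be illustrated with a minimal sketch. This assumes the usual definition, the median over all molecules of |predicted CCS - measured CCS| / measured CCS, expressed in percent; the abstract does not spell the formula out. The function name and the CCS values are hypothetical and are not taken from AllCCS2.

```python
from statistics import median


def median_relative_error(predicted, measured):
    """Return the median of |predicted - measured| / measured, in percent.

    Assumed definition of MedRE; the abstract does not state the formula.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = [
        abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)
    ]
    return median(relative_errors)


# Hypothetical CCS values in square angstroms, for illustration only.
measured = [150.0, 200.0, 250.0]
predicted = [151.5, 199.0, 255.0]

medre = median_relative_error(predicted, measured)
print(f"MedRE = {medre:.2f}%")
```

Because the median is used rather than the mean, a few poorly predicted molecules have little effect on the reported value, which is worth keeping in mind when comparing the training, validation, and testing figures.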