

Large-Scale Prediction of Collision Cross-Section Values for Metabolites in Ion Mobility-Mass Spectrometry.

Affiliation

Interdisciplinary Research Center on Biology and Chemistry and Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032 P. R. China.

Publication information

Anal Chem. 2016 Nov 15;88(22):11084-11091. doi: 10.1021/acs.analchem.6b03091. Epub 2016 Nov 1.

Abstract

The rapid development of metabolomics has significantly advanced health and disease related research. However, metabolite identification remains a major analytical challenge for untargeted metabolomics. While the use of collision cross-section (CCS) values obtained in ion mobility-mass spectrometry (IM-MS) effectively increases identification confidence of metabolites, it is restricted by the limited number of available CCS values for metabolites. Here, we demonstrated the use of a machine-learning algorithm called support vector regression (SVR) to develop a prediction method that utilized 14 common molecular descriptors to predict CCS values for metabolites. In this work, we first experimentally measured CCS values (Ω) of ∼400 metabolites in nitrogen buffer gas and used these values as training data to optimize the prediction method. The high prediction precision of this method was externally validated using an independent set of metabolites with a median relative error (MRE) of ∼3%, better than conventional theoretical calculation. Using the SVR based prediction method, a large-scale predicted CCS database was generated for 35 203 metabolites in the Human Metabolome Database (HMDB). For each metabolite, five different ion adducts in positive and negative modes were predicted, accounting for 176 015 CCS values in total. Finally, improved metabolite identification accuracy was demonstrated using real biological samples. Conclusively, our results proved that the SVR based prediction method can accurately predict nitrogen CCS values (Ω) of metabolites from molecular descriptors and effectively improve identification accuracy and efficiency in untargeted metabolomics. The predicted CCS database, namely, MetCCS, is freely available on the Internet.
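The two quantitative claims in the abstract, the median relative error (MRE) used for external validation and the size of the predicted database, can be illustrated with a minimal sketch. It assumes MRE means the median of the absolute relative errors between predicted and measured CCS values, expressed as a percentage; the abstract does not spell out the formula. The function name and the CCS values below are hypothetical and invented for illustration only; they are not data from the study.

```python
from statistics import median


def median_relative_error(measured, predicted):
    """Median of |predicted - measured| / measured, as a percentage.

    Assumed definition of MRE; the abstract does not give the formula.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if not measured:
        raise ValueError("need at least one value")
    errors = [abs(p - m) / m * 100.0 for m, p in zip(measured, predicted)]
    return median(errors)


# Hypothetical CCS values in square angstroms (illustration only, not study data).
measured_ccs = [150.0, 200.0, 250.0, 180.0]
predicted_ccs = [153.0, 196.0, 255.0, 189.0]

mre = median_relative_error(measured_ccs, predicted_ccs)
print(f"Median relative error: {mre:.1f}%")

# Database size as stated in the abstract: 35 203 metabolites, 5 adducts each.
n_metabolites = 35203
n_adducts = 5
total_ccs_values = n_metabolites * n_adducts
print(f"Total predicted CCS values: {total_ccs_values}")
```

The product of the metabolite count and the number of adducts per metabolite reproduces the 176 015 total reported in the abstract.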

