The METLIN small molecule dataset for machine learning-based retention time prediction.

Affiliations

Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, CA, USA.

Centre for Omic Sciences, EURECAT - Technology Centre of Catalonia & Rovira i Virgili University joint unit, Reus, Catalonia, Spain.

Publication information

Nat Commun. 2019 Dec 20;10(1):5811. doi: 10.1038/s41467-019-13680-7.

Abstract

Machine learning has been extensively applied in small molecule analysis to predict a wide range of molecular properties and processes including mass spectrometry fragmentation or chromatographic retention time. However, current approaches for retention time prediction lack sufficient accuracy due to limited available experimental data. Here we introduce the METLIN small molecule retention time (SMRT) dataset, an experimentally acquired reverse-phase chromatography retention time dataset covering up to 80,038 small molecules. To demonstrate the utility of this dataset, we deployed a deep learning model for retention time prediction applied to small molecule annotation. Results showed that in 70% of the cases, the correct molecular identity was ranked among the top 3 candidates based on their predicted retention time. We anticipate that this dataset will enable the community to apply machine learning or first principles strategies to generate better models for retention time prediction.
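
The top-3 figure reported in the abstract can be read as a top-k hit rate over candidate lists. The following is a minimal sketch of one way such a ranking could be computed, assuming candidates are ordered by the absolute difference between predicted and observed retention time. The abstract does not describe the actual ranking procedure, so the function names, the distance criterion, and the example values are all hypothetical and are not taken from the paper or from the SMRT dataset.

```python
from typing import Dict, List, Tuple

# Illustrative assumption: candidates are ranked by |predicted RT - observed RT|.
# The paper's own ranking procedure is not given in the abstract.


def rank_candidates(observed_rt: float, predicted_rt: Dict[str, float]) -> List[str]:
    """Return candidate IDs ordered from closest to farthest predicted RT."""
    return sorted(predicted_rt, key=lambda cid: abs(predicted_rt[cid] - observed_rt))


def top_k_hit(observed_rt: float, predicted_rt: Dict[str, float],
              true_id: str, k: int = 3) -> bool:
    """True if the correct identity is among the k best-ranked candidates."""
    return true_id in rank_candidates(observed_rt, predicted_rt)[:k]


def top_k_rate(cases: List[Tuple[float, Dict[str, float], str]], k: int = 3) -> float:
    """Fraction of cases in which the correct identity is ranked in the top k."""
    if not cases:
        raise ValueError("at least one case is required")
    hits = sum(top_k_hit(obs, preds, true_id, k) for obs, preds, true_id in cases)
    return hits / len(cases)


# Made-up example: (observed RT in seconds, predicted RT per candidate, correct ID).
cases = [
    (300.0, {"A": 310.0, "B": 450.0, "C": 290.0, "D": 600.0}, "A"),
    (500.0, {"E": 100.0, "F": 480.0, "G": 510.0, "H": 495.0}, "E"),
]

rate = top_k_rate(cases, k=3)
```

In this toy input the first case is a hit and the second is a miss. Other criteria, such as filtering candidates by a retention time error window, would be equally plausible readings of the abstract.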

