
Deep Learning Driven GC-MS Library Search and Its Application for Metabolomics.

Authors

Matyushin Dmitriy D, Sholokhova Anastasia Yu, Buryak Aleksey K

Affiliation

A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, Moscow, GSP-1, 119071, Russia.

Publication information

Anal Chem. 2020 Sep 1;92(17):11818-11825. doi: 10.1021/acs.analchem.0c02082. Epub 2020 Aug 12.

Abstract

Preliminary compound identification and peak annotation in gas chromatography-mass spectrometry (GC-MS) are usually performed using mass spectral databases. A few algorithms can search for a spectrum in a large mass spectral library. In many cases, a library search returns a wrong answer even when the correct compound is present in the library. In this work, we present a deep-learning-driven approach to library search that reduces the probability of such cases. Learning to rank (machine learning ranking) is a class of machine learning and deep learning algorithms that compare (rank) objects. This work introduces deep learning ranking for small-molecule identification using low-resolution electron ionization mass spectrometry. Instead of a simple similarity measure between two spectra, such as the dot product or the Euclidean distance between the vectors that represent them, a deep convolutional neural network is used. The deep learning ranking model outperforms other approaches and reduces the fraction of wrong answers (at rank 1) by 9-23%, depending on the data set. Spectra from the Golm Metabolome Database, the Human Metabolome Database, and FiehnLib were used to test the model.
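
To make the baseline concrete, the following is a minimal sketch of the kind of simple similarity search the abstract contrasts with the deep learning ranking model. It is not the authors' method: it only illustrates ranking library entries by a normalized dot product (cosine similarity) between binned spectra. The function names, the unit-mass binning up to m/z 500, and the toy spectra are all assumptions made for illustration.

```python
import numpy as np

def to_vector(peaks, max_mz=500):
    """Bin (m/z, intensity) peaks into a fixed-length vector at unit mass resolution.

    max_mz is an illustrative assumption, not a value from the paper.
    """
    vec = np.zeros(max_mz + 1)
    for mz, intensity in peaks:
        idx = int(round(mz))
        if 0 <= idx <= max_mz:
            vec[idx] += intensity
    return vec

def cosine_similarity(a, b):
    """Normalized dot product of two spectrum vectors (0.0 if either is empty)."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm > 0 else 0.0

def library_search(query_peaks, library, top_k=3):
    """Return the top_k (name, score) pairs, best match first."""
    q = to_vector(query_peaks)
    scored = [
        (name, cosine_similarity(q, to_vector(peaks)))
        for name, peaks in library.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy library with made-up spectra (hypothetical compounds, not real data).
library = {
    "compound_A": [(43, 100.0), (58, 40.0), (71, 10.0)],
    "compound_B": [(43, 20.0), (77, 100.0), (105, 60.0)],
    "compound_C": [(57, 100.0), (71, 50.0), (85, 30.0)],
}
query = [(43, 95.0), (58, 45.0), (71, 12.0)]

hits = library_search(query, library)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

In a search of this kind, the rank-1 hit is simply the library entry with the highest similarity score. The approach described in the abstract replaces this fixed similarity function with a deep convolutional neural network trained to rank candidates.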

