Contrastive Learning-Based Embedder for the Representation of Tandem Mass Spectra.

Affiliation

Hangzhou Hikvision Digital Technology Co. Ltd, Hangzhou 310051, P. R. China.

Publication information

Anal Chem. 2023 May 23;95(20):7888-7896. doi: 10.1021/acs.analchem.3c00260. Epub 2023 May 12.

Abstract

Tandem mass spectrometry (MS/MS) shows great promise in metabolomics research because it provides abundant information on compounds. With the rapid development of mass spectrometric techniques, a large number of MS/MS spectral data sets have been produced under different experimental conditions. This volume of data poses great challenges for spectral analysis, including compound identification and spectra clustering. The core challenge in MS/MS spectral analysis is how to describe a spectrum more quantitatively and effectively. Emerging deep-learning-based technologies offer new opportunities to address this challenge by producing high-quality descriptions of MS/MS spectra. In this study, we propose a novel contrastive learning-based method for the representation of MS/MS spectra, called CLERMS, which is based on the transformer architecture. Specifically, we propose an optimized model architecture equipped with a sinusoidal embedder, together with a novel loss function composed of InfoNCE loss and MSE loss, to obtain good embeddings from the peak information and the metadata. We evaluate our method on a GNPS data set, and the results demonstrate that the learned embeddings not only distinguish spectra from different compounds but also reveal the structural similarity between them. Additionally, a comparison with other methods on compound identification and spectra clustering shows that our method achieves significantly better results.
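The abstract names two building blocks without giving details: a sinusoidal embedder for the input and a loss that combines InfoNCE with MSE. The NumPy sketch below illustrates one plausible reading of each and is not the CLERMS implementation. All function names, the wavelength bounds, the temperature, the `mse_weight` factor, and the choice of a pairwise structural-similarity score as the MSE target are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def sinusoidal_embed(mz, dim=8, min_wavelength=1e-3, max_wavelength=1e4):
    """Map each m/z value to `dim` sine/cosine features (dim must be even).

    The wavelength bounds are illustrative assumptions, not values from the paper.
    """
    mz = np.asarray(mz, dtype=float)
    n_freq = dim // 2
    # Geometrically spaced wavelengths between the two assumed bounds.
    exponents = np.arange(n_freq) / max(n_freq - 1, 1)
    wavelengths = min_wavelength * (max_wavelength / min_wavelength) ** exponents
    angles = 2.0 * np.pi * mz[:, None] / wavelengths[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


def _l2_normalize(z):
    """Scale each row to unit length so dot products are cosine similarities."""
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE over a batch: row i of z_a and row i of z_b are the positive pair,
    and every other row of z_b serves as a negative for row i of z_a."""
    logits = _l2_normalize(z_a) @ _l2_normalize(z_b).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


def combined_loss(z_a, z_b, target_sim, mse_weight=1.0, temperature=0.1):
    """InfoNCE plus a weighted MSE term.

    Assumption: the MSE term regresses the cosine similarity of each pair
    onto a target similarity score (for example a structural similarity).
    """
    pred_sim = np.sum(_l2_normalize(z_a) * _l2_normalize(z_b), axis=1)
    mse = np.mean((pred_sim - np.asarray(target_sim, dtype=float)) ** 2)
    return info_nce(z_a, z_b, temperature) + mse_weight * mse


# Usage: embed two peak positions and score a toy batch of paired embeddings.
peaks = sinusoidal_embed([100.0, 200.5], dim=8)
rng = np.random.default_rng(0)
z_a = rng.normal(size=(4, 16))
z_b = z_a + 0.05 * rng.normal(size=(4, 16))  # slightly perturbed positives
loss = combined_loss(z_a, z_b, target_sim=np.ones(4))
```

In this reading, the InfoNCE term pulls matched pairs together and pushes other spectra in the batch apart, while the MSE term asks the cosine similarity between embeddings to track a continuous similarity target, which is one way an embedding could both separate compounds and reflect structural similarity.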

