Metabolite Identification through Machine Learning - Tackling CASMI Challenge Using FingerID

Authors

Shen Huibin, Zamboni Nicola, Heinonen Markus, Rousu Juho

Affiliations

Helsinki Institute for Information Technology HIIT; Department of Information and Computer Science, Aalto University, Konemiehentie 2, FI-02150 Espoo, Finland.

Institute of Molecular Systems Biology, ETH Zürich, Wolfgang-Pauli Street 16, 8093 Zürich, Switzerland.

Publication information

Metabolites. 2013 Jun 6;3(2):484-505. doi: 10.3390/metabo3020484.

Abstract

Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules. To alleviate this bottleneck, computational methods and tools that reliably filter the set of candidates are needed for further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBank have opened the door for developing a new genre of metabolite identification methods that rely on machine learning as the primary vehicle for identification. In this paper we describe the machine learning approach used in FingerID, its application to the CASMI challenges and some results that were not part of our challenge submission. In short, FingerID learns to predict molecular fingerprints from a large collection of MS/MS spectra, and uses the predicted fingerprints to retrieve and rank candidate molecules from a given large molecular database. Furthermore, we introduce a web server for FingerID, which was applied for the first time to the CASMI challenges. The challenge results show that the new machine learning framework produces competitive results on those challenge molecules that were found within the relatively restricted KEGG compound database. Additional experiments on the PubChem database confirm the feasibility of the approach even on a much larger database, although room for improvement still remains.
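
The retrieve-and-rank step described in the abstract can be illustrated with a minimal sketch. It assumes a binary fingerprint has already been predicted from an MS/MS spectrum and that each database candidate has a known fingerprint of the same length. Tanimoto similarity is used here only as a simple stand-in scoring rule; it is not claimed to be the scoring function FingerID uses. The function names, the candidate names, and the toy fingerprints are all hypothetical.

```python
from typing import Dict, List, Tuple


def tanimoto(a: List[int], b: List[int]) -> float:
    """Tanimoto similarity between two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero fingerprints are treated as identical.
    return both / either if either else 1.0


def rank_candidates(
    predicted: List[int], candidates: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Score every candidate against the predicted fingerprint, best first."""
    scored = [(name, tanimoto(predicted, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): a 6-bit predicted fingerprint and three candidates.
predicted = [1, 0, 1, 1, 0, 0]
candidates = {
    "candidate_A": [1, 0, 1, 1, 0, 0],
    "candidate_B": [1, 1, 0, 1, 0, 0],
    "candidate_C": [0, 1, 0, 0, 1, 1],
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the candidate set would come from a molecular database such as KEGG or PubChem, typically after filtering by precursor mass, and the scoring rule could weight each fingerprint bit by how reliably it is predicted.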

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/307a/3901273/51e5a2c2c89b/metabolites-03-00484-g001.jpg
