Metabolite Identification through Machine Learning - Tackling CASMI Challenge Using FingerID.

Authors

Shen Huibin, Zamboni Nicola, Heinonen Markus, Rousu Juho

Affiliations

Helsinki Institute for Information Technology HIIT; Department of Information and Computer Science, Aalto University, Konemiehentie 2, FI-02150 Espoo, Finland.

Institute of Molecular Systems Biology, ETH Zürich, Wolfgang-Pauli Street 16, 8093 Zürich, Switzerland.

Publication information

Metabolites. 2013 Jun 6;3(2):484-505. doi: 10.3390/metabo3020484.

DOI: 10.3390/metabo3020484
PMID: 24958002
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3901273/
Abstract

Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules. To alleviate this bottleneck, computational methods and tools that reliably filter the set of candidates are needed for further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBank have opened the door for developing a new genre of metabolite identification methods that rely on machine learning as the primary vehicle for identification. In this paper we describe the machine learning approach used in FingerID, its application to the CASMI challenges and some results that were not part of our challenge submission. In short, FingerID learns to predict molecular fingerprints from a large collection of MS/MS spectra, and uses the predicted fingerprints to retrieve and rank candidate molecules from a given large molecular database. Furthermore, we introduce a web server for FingerID, which was applied for the first time to the CASMI challenges. The challenge results show that the new machine learning framework produces competitive results on those challenge molecules that were found within the relatively restricted KEGG compound database. Additional experiments on the PubChem database confirm the feasibility of the approach even on a much larger database, although room for improvement still remains.
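
The retrieval step described in the abstract (use a predicted fingerprint to rank candidate molecules from a database) can be illustrated with a small sketch. This is not the FingerID implementation: the abstract does not specify the scoring function, so plain Tanimoto similarity is swapped in here as an assumption, the fingerprint is assumed to have already been predicted from an MS/MS spectrum by some trained model, and the names rank_candidates, candidate_A, candidate_B and candidate_C, as well as the six-bit fingerprints, are hypothetical toy values.

```python
from typing import Dict, List, Tuple

# A binary molecular fingerprint: one bit per structural property.
Fingerprint = Tuple[int, ...]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity of two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    # Two all-zero fingerprints are treated as identical.
    return both / either if either else 1.0


def rank_candidates(
    predicted: Fingerprint, candidates: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Score every candidate against the predicted fingerprint, best first.

    Assumption: similarity-based scoring stands in for whatever scoring
    the real system uses. Ties are broken alphabetically by name.
    """
    scored = [(name, tanimoto(predicted, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical fingerprint, standing in for the output of a model
# trained on MS/MS spectra.
predicted = (1, 0, 1, 1, 0, 0)

# Hypothetical candidate database (in practice, e.g., a KEGG or PubChem subset).
candidates = {
    "candidate_A": (1, 0, 1, 1, 0, 0),
    "candidate_B": (1, 0, 1, 0, 0, 1),
    "candidate_C": (0, 1, 0, 0, 1, 0),
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the candidate whose fingerprint matches the prediction exactly is ranked first. With a much larger database such as PubChem, many more candidates compete for the top ranks, which is consistent with the abstract's remark that room for improvement remains there.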
