SIMPLE: Sparse Interaction Model over Peaks of moLEcules for fast, interpretable metabolite identification from tandem mass spectra.

Affiliations

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan.

Department of Computer Science, Aalto University, Espoo, Finland.

Publication information

Bioinformatics. 2018 Jul 1;34(13):i323-i332. doi: 10.1093/bioinformatics/bty252.

PMID: 29950009
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6022642/
Abstract

MOTIVATION

Recent success in metabolite identification from tandem mass spectra has been led by machine learning, which proceeds in two stages: mapping mass spectra to molecular fingerprint vectors, and then retrieving candidate molecules from a database. In the first stage, i.e. fingerprint prediction, spectrum peaks are the features, and considering their interactions is a reasonable way to identify unknown metabolites more accurately. Existing approaches to fingerprint prediction are based only on individual peaks in the spectra, without explicitly considering peak interactions. Also, the current cutting-edge method is based on kernels, which are computationally heavy and difficult to interpret.
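The second stage, candidate retrieval, can be illustrated with a minimal sketch. The abstract does not say how candidates are scored against the predicted fingerprint, so the Tanimoto similarity used here is an assumption, and the function names, candidate names, and toy fingerprints are all hypothetical.

```python
from typing import Dict, List, Tuple


def tanimoto(a: List[int], b: List[int]) -> float:
    """Tanimoto (Jaccard) similarity between two binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero fingerprints are treated as identical.
    return both / either if either else 1.0


def rank_candidates(
    predicted: List[int], candidates: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Sort database candidates by similarity to the predicted fingerprint."""
    scored = [(name, tanimoto(predicted, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): a fingerprint predicted in stage one,
# and three database candidates with known fingerprints.
predicted = [1, 0, 1, 1, 0]
candidates = {
    "candidate_A": [1, 0, 1, 0, 0],
    "candidate_B": [0, 1, 0, 0, 1],
    "candidate_C": [1, 0, 1, 1, 0],
}
ranking = rank_candidates(predicted, candidates)
```

In this toy case the candidate whose fingerprint matches the prediction exactly ranks first. A real system would obtain `predicted` from a trained fingerprint predictor rather than a hand-written vector.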

RESULTS

We propose two learning models that incorporate peak interactions into fingerprint prediction. First, we extend the state-of-the-art kernel learning method by developing kernels for peak interactions and combining them with kernels for peaks through multiple kernel learning (MKL). Second, we formulate a sparse interaction model for metabolite peaks, which we call SIMPLE, that is computationally light and interpretable for fingerprint prediction. The formulation of SIMPLE is convex and guarantees global optimization, for which we develop an alternating direction method of multipliers (ADMM) algorithm. Experiments using the MassBank dataset show that both models achieved prediction accuracy comparable to that of the current top-performance kernel method. Furthermore, SIMPLE clearly revealed individual peaks and peak interactions that contribute to enhancing the performance of fingerprint prediction.
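To make the idea of a sparse interaction model concrete, here is a rough sketch of a score that combines individual peak terms with pairwise peak-interaction terms. The abstract does not give the model's equations, so the form `b + w·x + xᵀWx`, the variable names, and the use of an L1 penalty are assumptions for illustration, not the authors' formulation. The `soft_threshold` helper is the standard proximal operator of the L1 norm, which ADMM-style solvers commonly apply to obtain sparse weights; whether SIMPLE's algorithm uses exactly this step is not stated in the abstract.

```python
from typing import List


def interaction_score(
    x: List[float], w: List[float], W: List[List[float]], b: float = 0.0
) -> float:
    """Score = b + sum_i w[i]*x[i] + sum_{i,j} W[i][j]*x[i]*x[j].

    x holds peak intensities, w weights individual peaks, and W weights
    pairs of peaks (assumed form, not taken from the paper).
    """
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return b + linear + pairwise


def soft_threshold(v: List[float], t: float) -> List[float]:
    """Proximal operator of t*||v||_1: shrink each entry toward zero by t."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


# Toy example (hypothetical values): three peaks, one interacting pair (0, 2).
x = [1.0, 0.0, 2.0]
w = [0.5, -1.0, 0.0]
W = [
    [0.0, 0.0, 0.25],
    [0.0, 0.0, 0.0],
    [0.25, 0.0, 0.0],
]
score = interaction_score(x, w, W, b=-0.5)
bit = 1 if score > 0 else 0  # predicted value of one fingerprint bit

shrunk = soft_threshold([0.8, -0.3, 0.1], 0.2)  # small entries become zero
```

Because most entries of `w` and `W` would be zero in a sparse model, the nonzero ones can be read off directly as the peaks and peak pairs that drive a given fingerprint bit, which is the sense in which such a model is interpretable.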

AVAILABILITY AND IMPLEMENTATION

The code can be accessed at http://mamitsukalab.org/tools/SIMPLE/.
