JESTR: Joint Embedding Space Technique for Ranking candidate molecules for the annotation of untargeted metabolomics data.

Authors

Kalia Apurva, Zhou Chen Yan, Krishnan Dilip, Hassoun Soha

Affiliations

Department of Computer Science, Tufts University, Medford, MA 02155, United States.

Google DeepMind, Mountain View, CA 94043, United States.

Publication

Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf354.

DOI: 10.1093/bioinformatics/btaf354
PMID: 40574677
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12233093/

Abstract

MOTIVATION

A major challenge in metabolomics is annotation: assigning molecular structures to mass spectral fragmentation patterns. Despite recent advances in molecule-to-spectra and in spectra-to-molecular fingerprint (FP) prediction, annotation rates remain low.

RESULTS

We introduce in this article a novel tool (JESTR) for annotation. Unlike prior approaches that "explicitly" construct molecular FPs or spectra, JESTR leverages the insight that molecules and their corresponding spectra are views of the same data and effectively embeds their representations in a joint space. Candidate structures are ranked based on cosine similarity between the embeddings of query spectrum and each candidate. We evaluate JESTR against mol-to-spec, spec-to-FP, and spec-mol matching annotation tools on four datasets. On average, for rank@[1-20], JESTR outperforms other tools by 55.5%-302.6%. We further demonstrate the strong value of regularization with candidate molecules during training, boosting rank@1 performance by 5.72% across all datasets and enhancing the model's ability to discern between target and candidate molecules. When comparing JESTR's performance against that of publicly available pretrained models of SIRIUS and CFM-ID on appropriate subsets of MassSpecGym dataset, JESTR outperforms these tools by 31% and 238%, respectively. Through JESTR, we offer a novel promising avenue toward accurate annotation, therefore unlocking valuable insights into the metabolome.

AVAILABILITY AND IMPLEMENTATION

Code and dataset available at https://github.com/HassounLab/JESTR1/.
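
The ranking step described under RESULTS (scoring each candidate by the cosine similarity between its embedding and the query spectrum's embedding) can be illustrated with a minimal sketch. This is not the JESTR implementation: the learned spectrum and molecule encoders are not shown, the embedding vectors are made-up toy values, and the names `cosine_similarity`, `rank_candidates`, and `hit_at_k` are hypothetical. The `hit_at_k` helper assumes that rank@k means "the target molecule appears among the top k candidates", which the abstract does not spell out.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    spectrum_emb: Sequence[float],
    candidate_embs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(spectrum_emb, emb))
        for name, emb in candidate_embs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def hit_at_k(ranking: List[Tuple[str, float]], target: str, k: int) -> bool:
    """Assumed rank@k criterion: is the target among the top-k ranked candidates?"""
    return target in [name for name, _ in ranking[:k]]


# Toy 2-dimensional embeddings (illustrative values only, not model outputs).
query_spectrum = [1.0, 0.0]
candidates = {
    "mol_A": [0.0, 1.0],  # orthogonal to the query
    "mol_B": [2.0, 0.0],  # same direction as the query
    "mol_C": [1.0, 1.0],  # 45 degrees from the query
}

ranking = rank_candidates(query_spectrum, candidates)
for name, score in ranking:
    print(f"{name}\t{score:.4f}")
```

Cosine similarity ignores vector length, so `mol_B` ranks first even though its embedding is twice as long as the query's. In a real pipeline the embeddings would come from the trained encoders, and the candidate set from a molecular database lookup.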
