Increasing the Scale of the Mass Spectrometry Query Language Compendium with Explainable AI.

Authors

Harwood Thomas V, Wang Mingxun, Northen Trent R, Bowen Benjamin P

Affiliations

Joint Genome Institute, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, California 94720, United States.

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, California 94720, United States.

Publication information

Anal Chem. 2025 Sep 9;97(35):18860-18866. doi: 10.1021/acs.analchem.5c02591. Epub 2025 Aug 25.

DOI: 10.1021/acs.analchem.5c02591
PMID: 40851452

Abstract

A significant bottleneck in metabolomics data interpretation is the effective use of domain knowledge to assign structural information based on fragmentation patterns. The mass spectrometry query language (MassQL) aims to make this process accessible and applicable across multiple analysis platforms. While advanced computational methods are capable of predicting compound structures from fragmentation data, AI/ML approaches often rely on complex, opaque criteria that are difficult to interpret or modify. As a result, their predictive patterns cannot be readily translated into human-readable rules, such as those used in MassQL. In this study, we introduce ChemEcho, a machine learning embedding method that converts tandem mass spectrometry data into sparse feature vectors containing peak and neutral mass subformulae to enhance explainable AI/ML-based methods. An advantage of this approach is that decision trees trained using these feature vectors can be directly translated to MassQL. Using a battery of decision trees trained using ChemEcho embeddings to predict molecular attributes, we generated over 1500 MassQL queries for 765 molecular features and evaluated their precision and recall. From these queries, the 50 highest-performing queries were integrated into the MassQL compendium. This set of generated MassQL queries included environmentally and biologically relevant classes such as PFAS and molecules containing phosphate or sulfate substructures. To illustrate the impact these queries would have on a typical metabolomics experiment, these MassQL queries were applied to a public metabolomics data set, resulting in a marked increase in the structural information derived from tandem mass spectra. Access and reuse of these queries is expected to enhance structural annotation in untargeted experiments, leading to more specific claims and advancing many applications in metabolomics.
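
The idea that a decision tree over presence/absence features can be read off as a query can be illustrated with a minimal sketch. This is not ChemEcho's implementation or API. The `Node` class, the tiny hand-built tree, and the m/z values are hypothetical and chosen only for illustration. The sketch assumes each root-to-leaf path ending in a positive leaf becomes one query, with "feature present" branches written as plain MassQL conditions and "feature absent" branches written with the `:EXCLUDED` qualifier; how the authors handle absent branches and tolerances is not stated in the abstract. A small helper shows how precision and recall could be computed for a query's matches.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical binary tree node: split on presence of one spectral feature."""
    feature: Optional[str] = None    # MassQL-style condition; None marks a leaf
    absent: Optional["Node"] = None  # branch taken when the feature is missing
    present: Optional["Node"] = None # branch taken when the feature is observed
    label: Optional[bool] = None     # prediction, set only on leaves


Path = List[Tuple[str, bool]]


def positive_paths(node: Node, path: Tuple[Tuple[str, bool], ...] = ()) -> List[Path]:
    """Collect every root-to-leaf path that ends in a positive prediction."""
    if node.feature is None:
        return [list(path)] if node.label else []
    return (
        positive_paths(node.absent, path + ((node.feature, False),))
        + positive_paths(node.present, path + ((node.feature, True),))
    )


def path_to_query(path: Path) -> str:
    """Render one path as a MassQL-style query string (assumed mapping)."""
    if not path:
        raise ValueError("a path with no conditions cannot be turned into a query")
    conditions = [f if present else f + ":EXCLUDED" for f, present in path]
    return "QUERY scaninfo(MS2DATA) WHERE " + " AND ".join(conditions)


def precision_recall(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Precision and recall of query matches against known attribute labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative tree for a phosphate-like attribute (values are assumptions):
# a product ion near m/z 78.9591, or else a neutral loss near 97.9769.
root = Node(
    feature="MS2PROD=78.9591",
    present=Node(label=True),
    absent=Node(
        feature="MS2NL=97.9769",
        present=Node(label=True),
        absent=Node(label=False),
    ),
)

queries = [path_to_query(p) for p in positive_paths(root)]
for q in queries:
    print(q)
```

Each positive leaf yields its own query, so a single tree can produce several queries for one molecular attribute. A real pipeline would also need mass tolerances and a trained tree in place of the hand-built one.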
