An end-to-end mass spectrometry data classification model with a unified architecture.

Author information

Wang Yinchu, Zhang Wei, Guo Lin, Zhang Fengyi, Liu Zilong, Xiong Xingchuang, Fang Xiang

Affiliations

Center for Metrology Scientific Data and Energy Metrology, National Institute of Metrology, Beijing, 100029, China.

Key Laboratory of Metrology Digitalization and Digital Metrology for State Market Regulation, Department, Beijing, 100029, China.

Publication information

Sci Rep. 2025 May 30;15(1):19065. doi: 10.1038/s41598-025-03741-x.

DOI: 10.1038/s41598-025-03741-x

PMID: 40447698

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12125398/

Abstract

Mass spectrometry, known for its high sensitivity, selectivity, rich structural information, and rapid analysis capabilities, is widely used in disease diagnosis and bioanalysis. Despite progress in classification methods/tools for data collection in the past decade, problems such as complex data processing, weak model characterization, and large interbatch differences persist. To address these problems, we present MS-DREDFeaMiC, a deep neural network framework for disease diagnosis and bioanalysis via mass spectrometry data that enables end-to-end training. The trained MS-DREDFeaMiC can integrate mixed features, reduce interbatch differences, and enhance feature distinctions among categories. To demonstrate its wide applicability, ten comparative experiments were conducted with seven public datasets and one self-constructed dataset, and MS-DREDFeaMiC yielded state-of-the-art results. MS-DREDFeaMiC achieved average accuracies that were 6.6% and 6.3% higher than those of Transformer and Mamba, respectively. We anticipate that MS-DREDFeaMiC can be directly applied to routine disease diagnosis and that any mass spectrometry-based classification studies can benefit from such an end-to-end trained model.

