Workflow for Evaluating Normalization Tools for Omics Data Using Supervised and Unsupervised Machine Learning.

Affiliations

Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States.

Department of Chemistry and Biochemistry and the Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States.

Publication information

J Am Soc Mass Spectrom. 2023 Dec 6;34(12):2775-2784. doi: 10.1021/jasms.3c00295. Epub 2023 Oct 28.

DOI: 10.1021/jasms.3c00295
PMID: 37897440
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10919320/
Abstract

To achieve high quality omics results, systematic variability in mass spectrometry (MS) data must be adequately addressed. Effective data normalization is essential for minimizing this variability. The abundance of approaches and the data-dependent nature of normalization have led some researchers to develop open-source academic software for choosing the best approach. While these tools are certainly beneficial to the community, none of them meet all of the needs of all users, particularly users who want to test new strategies that are not available in these products. Herein, we present a simple and straightforward workflow that facilitates the identification of optimal normalization strategies using straightforward evaluation metrics, employing both supervised and unsupervised machine learning. The workflow offers a "DIY" aspect, where the performance of any normalization strategy can be evaluated for any type of MS data. As a demonstration of its utility, we apply this workflow on two distinct datasets, an ESI-MS dataset of extracted lipids from latent fingerprints and a cancer spheroid dataset of metabolites ionized by MALDI-MSI, for which we identified the best-performing normalization strategies.
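
The abstract does not say which normalization strategies or evaluation metrics the authors used, so the following is only a minimal sketch of the general idea: apply several candidate normalizations to the same intensity matrix and score each one with a supervised metric and an unsupervised one. Everything named here is an assumption made for illustration: the three strategies (none, total ion current, median), the leave-one-out nearest-centroid accuracy, the silhouette score computed on PCA scores, and the synthetic data produced by the hypothetical `make_demo_data` helper. Adding a new strategy to the `STRATEGIES` dictionary is how the "DIY" aspect might look in practice.

```python
import numpy as np

# Candidate normalization strategies (illustrative choices, not taken from the paper).
def no_normalization(X):
    return X

def tic_normalize(X):
    """Divide each spectrum (row) by its summed intensity."""
    return X / X.sum(axis=1, keepdims=True)

def median_normalize(X):
    """Divide each spectrum (row) by its median intensity."""
    return X / np.median(X, axis=1, keepdims=True)

STRATEGIES = {"raw": no_normalization, "tic": tic_normalize, "median": median_normalize}

def loo_accuracy(X, y):
    """Supervised metric: leave-one-out accuracy of a nearest-centroid classifier."""
    classes = np.unique(y)
    correct = 0
    for i in range(len(y)):
        keep = np.ones(len(y), dtype=bool)
        keep[i] = False  # hold out sample i
        centroids = np.array([X[keep & (y == c)].mean(axis=0) for c in classes])
        pred = classes[np.argmin(np.linalg.norm(centroids - X[i], axis=1))]
        correct += int(pred == y[i])
    return correct / len(y)

def pca_silhouette(X, y, n_components=2):
    """Unsupervised projection (PCA via SVD), scored by the mean silhouette
    of the known sample groups in the PCA score space."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    D = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    classes = np.unique(y)
    sil = []
    for i in range(len(y)):
        same = y == y[i]
        same[i] = False  # exclude the sample itself
        a = D[i, same].mean()
        b = min(D[i, y == c].mean() for c in classes if c != y[i])
        sil.append((b - a) / max(a, b))
    return float(np.mean(sil))

def evaluate(X, y, strategies=STRATEGIES):
    """Score every strategy with both metrics."""
    out = {}
    for name, normalize in strategies.items():
        Xn = normalize(X)
        out[name] = {"accuracy": loo_accuracy(Xn, y),
                     "silhouette": pca_silhouette(Xn, y)}
    return out

def make_demo_data(n_per_class=20, n_features=20, seed=0):
    """Synthetic two-class data with a random per-sample intensity scale
    standing in for systematic variability (hypothetical, for illustration)."""
    rng = np.random.default_rng(seed)
    profile_a = np.full(n_features, 100.0)
    profile_b = profile_a.copy()
    profile_b[:5] *= 2.0  # five features differ between the classes
    profiles = np.vstack([np.tile(profile_a, (n_per_class, 1)),
                          np.tile(profile_b, (n_per_class, 1))])
    y = np.repeat([0, 1], n_per_class)
    noise = rng.normal(1.0, 0.05, size=profiles.shape)
    scale = np.exp(rng.uniform(-1.5, 1.5, size=(2 * n_per_class, 1)))
    return profiles * noise * scale, y

X, y = make_demo_data()
for name, metrics in evaluate(X, y).items():
    print(f"{name:>6}: accuracy={metrics['accuracy']:.2f}, "
          f"silhouette={metrics['silhouette']:.2f}")
```

In this toy setting the per-sample scale factor dominates the raw data, so a strategy that removes it should score higher on both metrics. Real MS data would call for a proper cross-validation scheme and metrics chosen to suit the study design.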

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45b4/10919320/6de4d3e2f427/nihms-1965907-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45b4/10919320/10867f77f70d/nihms-1965907-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45b4/10919320/ac784baf67b6/nihms-1965907-f0002.jpg
