Workflow for Evaluating Normalization Tools for Omics Data Using Supervised and Unsupervised Machine Learning.

Affiliations

Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States.

Department of Chemistry and Biochemistry and the Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States.

Publication information

J Am Soc Mass Spectrom. 2023 Dec 6;34(12):2775-2784. doi: 10.1021/jasms.3c00295. Epub 2023 Oct 28.

Abstract

To achieve high-quality omics results, systematic variability in mass spectrometry (MS) data must be adequately addressed. Effective data normalization is essential for minimizing this variability. The abundance of approaches and the data-dependent nature of normalization have led some researchers to develop open-source academic software for choosing the best approach. While these tools are certainly beneficial to the community, none of them meets all of the needs of all users, particularly users who want to test new strategies that are not available in these products. Herein, we present a simple workflow that facilitates the identification of optimal normalization strategies using straightforward evaluation metrics, employing both supervised and unsupervised machine learning. The workflow offers a "DIY" aspect, where the performance of any normalization strategy can be evaluated for any type of MS data. As a demonstration of its utility, we apply this workflow to two distinct datasets, an ESI-MS dataset of extracted lipids from latent fingerprints and a cancer spheroid dataset of metabolites ionized by MALDI-MSI, for which we identified the best-performing normalization strategies.
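The general idea of scoring candidate normalization strategies with one supervised and one unsupervised metric can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the metrics or algorithms used, so the leave-one-out nearest-centroid accuracy, the two-cluster k-means purity, the toy spectra, and every function name below are assumptions chosen for brevity.

```python
import math
from collections import Counter

def tic_normalize(spectrum):
    """Total-ion-current normalization: scale intensities to sum to 1."""
    total = sum(spectrum)
    return [v / total for v in spectrum]

def median_normalize(spectrum):
    """Divide every intensity by the spectrum's median intensity."""
    s = sorted(spectrum)
    n = len(s)
    median = s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])
    return [v / median for v in spectrum]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def loo_nearest_centroid_accuracy(X, y):
    """Supervised score (assumed): leave-one-out nearest-centroid accuracy."""
    labels = sorted(set(y))
    correct = 0
    for i in range(len(X)):
        cents = {
            lab: centroid([X[j] for j in range(len(X)) if y[j] == lab and j != i])
            for lab in labels
        }
        pred = min(labels, key=lambda lab: distance(X[i], cents[lab]))
        correct += pred == y[i]
    return correct / len(X)

def two_means_purity(X, y, iterations=20):
    """Unsupervised score (assumed): cluster without labels using 2-means,
    then report how pure the resulting clusters are w.r.t. the known groups."""
    c0 = X[0]
    c1 = max(X, key=lambda row: distance(row, c0))  # farthest-point init
    cents = [c0, c1]
    assign = []
    for _ in range(iterations):
        assign = [min((0, 1), key=lambda k: distance(row, cents[k])) for row in X]
        for k in (0, 1):
            members = [row for row, a in zip(X, assign) if a == k]
            if members:  # keep the old centroid if a cluster empties
                cents[k] = centroid(members)
    majority = 0
    for k in (0, 1):
        counts = Counter(lab for lab, a in zip(y, assign) if a == k)
        if counts:
            majority += counts.most_common(1)[0][1]
    return majority / len(X)

def evaluate(strategies, spectra, labels):
    """Apply each strategy, then return {name: (accuracy, purity)}."""
    results = {}
    for name, normalize in strategies.items():
        X = [normalize(s) for s in spectra]
        results[name] = (
            loo_nearest_centroid_accuracy(X, labels),
            two_means_purity(X, labels),
        )
    return results

# Toy data (hypothetical): two groups with different relative profiles,
# each spectrum multiplied by an arbitrary per-sample intensity factor.
PROFILES = [
    [10, 1, 1, 1], [9, 1, 2, 1], [10, 2, 1, 1], [11, 1, 1, 2],   # group A
    [1, 10, 1, 1], [2, 9, 1, 1], [1, 10, 2, 1], [1, 11, 1, 2],   # group B
]
SCALES = [1, 2, 4, 8, 8, 4, 2, 1]
SPECTRA = [[v * s for v in p] for p, s in zip(PROFILES, SCALES)]
LABELS = ["A"] * 4 + ["B"] * 4

STRATEGIES = {"none": list, "tic": tic_normalize, "median": median_normalize}
scores = evaluate(STRATEGIES, SPECTRA, LABELS)
for name, (accuracy, purity) in scores.items():
    print(f"{name:>6}: accuracy={accuracy:.2f} purity={purity:.2f}")
```

Any other normalization strategy can be tested by adding a function to `STRATEGIES`, which mirrors the "DIY" aspect described above; the strategy with the highest scores on both metrics would be the candidate of choice for that dataset.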
