Optimization of metabolomic data processing using NOREVA.

Author information

College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, China.

Publication information

Nat Protoc. 2022 Jan;17(1):129-151. doi: 10.1038/s41596-021-00636-9. Epub 2021 Dec 24.

Abstract

A typical output of a metabolomic experiment is a peak table corresponding to the intensity of measured signals. Peak table processing, an essential procedure in metabolomics, is characterized by its study dependency and combinatorial diversity. While various methods and tools have been developed to facilitate metabolomic data processing, it is challenging to determine which processing workflow will give good performance for a specific metabolomic study. NOREVA, an out-of-the-box protocol, was therefore developed to meet this challenge. First, the peak table is subjected to many processing workflows that consist of three to five defined calculations in combinatorially determined sequences. Second, the results of each workflow are judged against objective performance criteria. Third, various benchmarks are analyzed to highlight the uniqueness of this newly developed protocol in (1) evaluating the processing performance based on multiple criteria, (2) optimizing data processing by scanning thousands of workflows, and (3) allowing data processing for time-course and multiclass metabolomics. This protocol is implemented in an R package for convenient accessibility and to protect users' data privacy. Preliminary experience in R language would facilitate the usage of this protocol, and the execution time may vary from several minutes to a couple of hours depending on the size of the analyzed data.
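The first two stages described in the abstract (run many ordered combinations of processing steps, then judge each result against a criterion) can be illustrated with a minimal Python sketch. This is not NOREVA's R API. The three processing functions, the `separation_score` criterion, and the toy peak table are all hypothetical stand-ins chosen only so that any ordering is safe to execute. The actual protocol uses its own defined calculations, workflows of three to five steps, and multiple performance criteria.

```python
import itertools
import statistics

# Hypothetical processing steps (illustrative stand-ins, not NOREVA's methods).
# A peak table here is a list of samples (rows) by metabolites (columns).

def cube_root(table):
    """Sign-preserving cube-root transformation of every intensity."""
    def cbrt(v):
        return abs(v) ** (1.0 / 3.0) * (1 if v >= 0 else -1)
    return [[cbrt(v) for v in row] for row in table]

def median_center_samples(table):
    """Subtract each sample's median intensity from that sample."""
    out = []
    for row in table:
        med = statistics.median(row)
        out.append([v - med for v in row])
    return out

def auto_scale_features(table):
    """Center each metabolite and divide by its standard deviation."""
    scaled_cols = []
    for col in zip(*table):
        mean = statistics.fmean(col)
        sd = statistics.pstdev(col) or 1.0  # guard against constant columns
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*scaled_cols)]

STEPS = {
    "cube_root": cube_root,
    "median_center": median_center_samples,
    "auto_scale": auto_scale_features,
}

def enumerate_workflows(names, min_len=2, max_len=3):
    """Yield every ordered sequence of distinct steps within the length bounds."""
    for k in range(min_len, max_len + 1):
        yield from itertools.permutations(names, k)

def run_workflow(table, workflow):
    """Apply the named steps to the table in order."""
    for name in workflow:
        table = STEPS[name](table)
    return table

def separation_score(table, labels):
    """Illustrative criterion: mean between-group / within-group variance."""
    ratios = []
    for col in zip(*table):
        groups = {}
        for v, lab in zip(col, labels):
            groups.setdefault(lab, []).append(v)
        group_means = [statistics.fmean(g) for g in groups.values()]
        between = statistics.pvariance(group_means)
        within = statistics.fmean([statistics.pvariance(g) for g in groups.values()])
        ratios.append(between / (within + 1e-12))
    return statistics.fmean(ratios)

# Toy data (made up for illustration): 6 samples, 4 metabolites, 2 classes.
peak_table = [
    [120.0, 35.0, 980.0, 14.0],
    [135.0, 31.0, 1010.0, 17.0],
    [128.0, 38.0, 940.0, 15.0],
    [210.0, 33.0, 1500.0, 29.0],
    [190.0, 36.0, 1420.0, 25.0],
    [225.0, 30.0, 1610.0, 31.0],
]
labels = ["control"] * 3 + ["case"] * 3

# Score every workflow and rank from best to worst under this one criterion.
ranking = sorted(
    ((separation_score(run_workflow(peak_table, wf), labels), wf)
     for wf in enumerate_workflows(list(STEPS))),
    reverse=True,
)
for score, wf in ranking:
    print(f"{score:10.3f}  {' -> '.join(wf)}")
```

With three steps and sequence lengths of two or three, the sketch scans 12 workflows. The number of ordered sequences grows quickly as more candidate methods are added per step, which is consistent with the abstract's mention of scanning thousands of workflows.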
