Exploring dynamic metabolomics data with multiway data analysis: a simulation study.

Affiliations

Machine Intelligence Department, Simula Metropolitan Center for Digital Engineering, Oslo, Norway.

Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands.

Publication information

BMC Bioinformatics. 2022 Jan 10;23(1):31. doi: 10.1186/s12859-021-04550-5.

DOI: 10.1186/s12859-021-04550-5

PMID: 35012453

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8750750/

Abstract

BACKGROUND

Analysis of dynamic metabolomics data holds the promise to improve our understanding of underlying mechanisms in metabolism. For example, it may detect changes in metabolism due to the onset of a disease. Dynamic or time-resolved metabolomics data can be arranged as a three-way array with entries organized according to a subjects mode, a metabolites mode and a time mode. While such time-evolving multiway data sets are increasingly collected, revealing the underlying mechanisms and their dynamics from such data remains challenging. For such data, one of the complexities is the presence of a superposition of several sources of variation: induced variation (due to experimental conditions or inborn errors), individual variation, and measurement error. Multiway data analysis (also known as tensor factorizations) has been successfully used in data mining to find the underlying patterns in multiway data. To explore the performance of multiway data analysis methods in terms of revealing the underlying mechanisms in dynamic metabolomics data, simulated data with known ground truth can be studied.
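The three-way arrangement and the superposition of variation sources described above can be illustrated with a minimal sketch. It is not taken from the paper: the group labels, effect sizes, noise levels, and the single exponential time profile are all hypothetical choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_metabolites, n_time = 20, 6, 15

# Hypothetical design: first half "control", second half "perturbed".
group = np.repeat([0.0, 1.0], n_subjects // 2)

# Subject scores combine induced variation (a group shift)
# with individual variation (small random deviations per subject).
induced = 1.0 + 0.8 * group
individual = rng.normal(scale=0.1, size=n_subjects)
a = induced + individual                        # shape (n_subjects,)

b = rng.uniform(0.5, 1.5, size=n_metabolites)   # metabolite loadings (assumed)
t = np.linspace(0.0, 5.0, n_time)
c = np.exp(-0.5 * t)                            # temporal profile (assumed)

# Rank-one signal X[i, j, k] = a[i] * b[j] * c[k], plus measurement error.
signal = np.einsum("i,j,k->ijk", a, b, c)
noise = rng.normal(scale=0.02, size=signal.shape)
X = signal + noise                              # subjects x metabolites x time

print(X.shape)
```

Each entry of `X` mixes all three sources of variation, which is why separating them again requires a model that exploits the three-way structure.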

RESULTS

We focus on simulated data arising from dynamic models of increasing complexity: a simple linear system, a yeast glycolysis model, and a human cholesterol model. We generate data with induced variation as well as individual variation. Systematic experiments are performed to demonstrate the advantages and limitations of multiway data analysis in analyzing such dynamic metabolomics data, and its capacity to disentangle the different sources of variation. We use simulations because knowing the ground truth makes it possible to assess what multiway data analysis methods can recover.
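As a rough sketch of how data of this kind might be generated for the simplest case, the code below simulates a linear system dx/dt = A x for several subjects and stacks the trajectories into a subjects x metabolites x time array. The two-metabolite chain, the rate constants, the group effect on `k2`, and the jitter size are assumptions made for illustration; they are not the models or parameter values used in the paper.

```python
import numpy as np

def simulate_linear_chain(k1, k2, x0, t_grid, substeps=20):
    """Integrate dx/dt = A x for a two-metabolite chain with classical RK4."""
    A = np.array([[-k1, 0.0],
                  [k1, -k2]])
    x = np.array(x0, dtype=float)
    out = [x.copy()]
    for t_prev, t_next in zip(t_grid[:-1], t_grid[1:]):
        h = (t_next - t_prev) / substeps
        for _ in range(substeps):
            s1 = A @ x
            s2 = A @ (x + 0.5 * h * s1)
            s3 = A @ (x + 0.5 * h * s2)
            s4 = A @ (x + h * s3)
            x = x + (h / 6.0) * (s1 + 2 * s2 + 2 * s3 + s4)
        out.append(x.copy())
    return np.array(out).T          # shape (n_metabolites, n_time)

rng = np.random.default_rng(1)
t_grid = np.linspace(0.0, 10.0, 21)
n_subjects = 10
X = np.empty((n_subjects, 2, t_grid.size))
for i in range(n_subjects):
    # Induced variation (hypothetical): second half of subjects has a lower k2.
    k2_base = 0.3 if i >= n_subjects // 2 else 0.6
    # Individual variation: small multiplicative jitter on each rate constant.
    k1 = 1.0 * np.exp(rng.normal(scale=0.05))
    k2 = k2_base * np.exp(rng.normal(scale=0.05))
    X[i] = simulate_linear_chain(k1, k2, x0=[1.0, 0.0], t_grid=t_grid)

print(X.shape)
```

Because the generating parameters are known for every subject, any pattern extracted from `X` can be checked directly against them, which is the point of working with simulated data.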

CONCLUSION

Our numerical experiments demonstrate that, despite the increasing complexity of the studied dynamic metabolic models, the tensor factorization methods CANDECOMP/PARAFAC (CP) and Parallel Profiles with Linear Dependences (Paralind) can disentangle the sources of variation and thereby reveal the underlying mechanisms and their dynamics.
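For readers unfamiliar with CP, the following is a minimal sketch of fitting a CP model to a three-way array with plain alternating least squares in NumPy. It is an illustrative textbook-style implementation, not the authors' code, and it does not cover Paralind; the function names `khatri_rao` and `cp_als`, the iteration count, and the test tensor are assumptions.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: row (i * V.shape[0] + j) holds U[i] * V[j]."""
    R = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, R)

def cp_als(X, rank, n_iter=500, seed=0):
    """Fit X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # Unfold X along the current mode (C-order over the remaining modes).
            unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
            kr = khatri_rao(others[0], others[1])
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # Least-squares update of the current factor with the others fixed.
            factors[mode] = unfolded @ kr @ np.linalg.pinv(gram)
    return factors

# Noise-free rank-2 test tensor with known (randomly drawn) factors.
rng = np.random.default_rng(42)
A_true = rng.standard_normal((8, 2))
B_true = rng.standard_normal((5, 2))
C_true = rng.standard_normal((6, 2))
X = np.einsum("ir,jr,kr->ijk", A_true, B_true, C_true)

A, B, C = cp_als(X, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(rel_err)
```

In a three-way array arranged as subjects x metabolites x time, the three factor matrices would correspond to subject scores, metabolite loadings, and temporal profiles. The recovered factors are only determined up to scaling and permutation of the components, so comparison with a ground truth needs to account for both.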

Similar articles

Exploring dynamic metabolomics data with multiway data analysis: a simulation study.

BMC Bioinformatics. 2022 Jan 10;23(1):31. doi: 10.1186/s12859-021-04550-5.

Analyzing postprandial metabolomics data using multiway models: a simulation study.

BMC Bioinformatics. 2024 Mar 4;25(1):94. doi: 10.1186/s12859-024-05686-w.

Characterizing human postprandial metabolic response using multiway data analysis.

Metabolomics. 2024 May 9;20(3):50. doi: 10.1007/s11306-024-02109-y.

Revealing static and dynamic biomarkers from postprandial metabolomics data through coupled matrix and tensor factorizations.

Metabolomics. 2024 Jul 27;20(4):86. doi: 10.1007/s11306-024-02128-9.

Bayesian Nonparametric Models for Multiway Data Analysis.

IEEE Trans Pattern Anal Mach Intell. 2015 Feb;37(2):475-87. doi: 10.1109/TPAMI.2013.201.

CP Tensor Decomposition with Cannot-Link Intermode Constraints.

Proc SIAM Int Conf Data Min. 2019 May;2019:711-719. doi: 10.1137/1.9781611975673.80.

Supervised multiway factorization.

Electron J Stat. 2018;12(1):1150-1180. doi: 10.1214/18-EJS1421. Epub 2018 Mar 27.

Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification

Bayesian Robust Tensor Factorization for Incomplete Multiway Data.

IEEE Trans Neural Netw Learn Syst. 2016 Apr;27(4):736-48. doi: 10.1109/TNNLS.2015.2423694. Epub 2015 Jun 9.

Tracing Evolving Networks Using Tensor Factorizations vs. ICA-Based Approaches.

Front Neurosci. 2022 Apr 25;16:861402. doi: 10.3389/fnins.2022.861402. eCollection 2022.

Cited by

Longitudinal Metabolomics Data Analysis Informed by Mechanistic Models.

Metabolites. 2024 Dec 24;15(1):2. doi: 10.3390/metabo15010002.

Characterizing human postprandial metabolic response using multiway data analysis.

Metabolomics. 2024 May 9;20(3):50. doi: 10.1007/s11306-024-02109-y.

Analyzing postprandial metabolomics data using multiway models: a simulation study.

BMC Bioinformatics. 2024 Mar 4;25(1):94. doi: 10.1186/s12859-024-05686-w.

Data-driven analysis and prediction of dynamic postprandial metabolic response to multiple dietary challenges using dynamic mode decomposition.

Front Nutr. 2024 Jan 12;10:1304540. doi: 10.3389/fnut.2023.1304540. eCollection 2023.

Discrimination of missing data types in metabolomics data based on particle swarm optimization algorithm and XGBoost model.

Sci Rep. 2024 Jan 2;14(1):152. doi: 10.1038/s41598-023-50646-8.

Tracing Evolving Networks Using Tensor Factorizations vs. ICA-Based Approaches.

Front Neurosci. 2022 Apr 25;16:861402. doi: 10.3389/fnins.2022.861402. eCollection 2022.
