Lomonosov Moscow State University, Chemistry Department, 119992, GSP-2, Lenin Hills, 1b3, Moscow, Russia.
Anal Methods. 2020 Jul 28;12(28):3582-3591. doi: 10.1039/d0ay00204f. Epub 2020 Jul 8.
A data processing workflow for LC-MS-based metabolomics studies is proposed, comprising signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The approach requires only an annotation-free peak table and produces a strongly reduced set of the most relevant features, validated by Receiver Operating Characteristic (ROC) analysis of the selected predictors, cross-validation and unsupervised projection. The workflow was initially optimised on the authors' own experimental dataset and then successfully tested on 36 datasets from 21 publicly available metabolomics projects. It can be used for classification in high-dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion.
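As an illustration of the univariate ROC step only, the following is a minimal sketch and not the authors' implementation. It assumes a peak table with samples as rows and features as columns and a binary class label per sample. The function names `roc_auc` and `rank_features`, the toy intensities and the ranking criterion (distance of the AUC from 0.5) are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def roc_auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve for one feature.

    Computed as the fraction of (case, control) pairs in which the case
    has the higher value; tied pairs count as 0.5 (Mann-Whitney form).
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(
    peak_table: Sequence[Sequence[float]],
    labels: Sequence[int],
    top_k: int,
) -> List[Tuple[int, float]]:
    """Return the top_k (feature index, AUC) pairs.

    Features are ordered by |AUC - 0.5|, so strongly increased and
    strongly decreased signals are both ranked highly (an assumption
    of this sketch, not a criterion stated in the abstract).
    """
    n_features = len(peak_table[0])
    scored = []
    for j in range(n_features):
        column = [row[j] for row in peak_table]
        scored.append((j, roc_auc(column, labels)))
    scored.sort(key=lambda item: abs(item[1] - 0.5), reverse=True)
    return scored[:top_k]


# Hypothetical annotation-free peak table: 6 samples x 3 features.
peak_table = [
    [1.0, 5.0, 3.0],
    [1.2, 4.0, 9.0],
    [0.9, 6.0, 4.0],
    [3.1, 5.5, 8.0],
    [2.9, 4.5, 2.0],
    [3.3, 5.0, 7.0],
]
labels = [0, 0, 0, 1, 1, 1]  # 0 = control, 1 = case

selected = rank_features(peak_table, labels, top_k=1)
print(selected)
```

In this toy table only the first feature separates the two groups, so it is the one retained. The workflow described in the abstract would combine such a step with drift correction, supervised modelling and cross-validation, none of which are shown here.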