Thysell Elin, Chorell Elin, Svensson Michael B, Jonsson Pär, Antti Henrik
Department of Chemistry, Computational Life Science Cluster (CLiC), Umeå University, SE-901 87 Umeå, Sweden.
Department of Public Health and Clinical Medicine, Umeå University, SE-901 87 Umeå, Sweden.
Metabolites. 2012 Oct 31;2(4):796-817. doi: 10.3390/metabo2040796.
The suggested approach makes it feasible to screen large metabolomics sample sets with retained data quality, or to retrieve significant metabolic information from small sample sets that can be verified over multiple studies. Hierarchical multivariate curve resolution (H-MCR), followed by orthogonal partial least squares discriminant analysis (OPLS-DA), was used for processing and classification of gas chromatography/time-of-flight mass spectrometry (GC/TOFMS) data characterizing human serum samples collected in a study of strenuous physical exercise. Predictive H-MCR processing of representative sample subsets, selected by chemometric approaches, was shown to be an efficient way of generating high-quality data. Extensive model validation by means of cross-validation and external predictions verified the robustness of the extracted metabolite patterns in the data. Comparisons of extracted metabolite patterns between models emphasized the reliability of the methodology in a biological information context. Furthermore, the high predictive power in longitudinal data provided proof of the method's potential use in clinical diagnosis. Finally, the predictive metabolite pattern was interpreted physiologically, highlighting the biological relevance of the diagnostic pattern.
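The abstract refers to representative sample subsets "selected by chemometric approaches" but does not name the selection method. As an illustration only, the sketch below uses the Kennard-Stone max-min rule, a common chemometric choice for picking a subset that spans the sample space. Its use here is an assumption, not the authors' stated procedure; the function name `kennard_stone` and the toy coordinates are hypothetical.

```python
import math


def kennard_stone(samples, k):
    """Return indices of k samples chosen by the Kennard-Stone max-min rule.

    Illustrative assumption: the abstract does not say which subset
    selection method was used. `samples` is a list of equal-length
    numeric tuples (for example, score vectors from a multivariate model).
    """
    n = len(samples)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")

    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - set(selected)

    while len(selected) < k:
        # Add the sample whose nearest already-selected neighbour is farthest away.
        nxt = max(
            sorted(remaining),
            key=lambda i: min(dist[i][s] for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Toy, made-up coordinates (not data from the study).
toy_scores = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
subset = kennard_stone(toy_scores, 3)
```

In a workflow of the kind the abstract describes, the selected indices would mark the samples passed to the curve resolution step, with the remaining samples processed predictively. That mapping is likewise an assumption made for illustration.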