Pohjanen Elin, Thysell Elin, Jonsson Pär, Eklund Caroline, Silfver Anders, Carlsson Inga-Britt, Lundgren Krister, Moritz Thomas, Svensson Michael B, Antti Henrik
Research Group for Chemometrics, Department of Chemistry, Umeå University, SE-901 87 Umeå, Sweden.
J Proteome Res. 2007 Jun;6(6):2113-20. doi: 10.1021/pr070007g. Epub 2007 Apr 12.
A novel hypothesis-free multivariate screening methodology for the study of human exercise metabolism in blood serum is presented. Serum gas chromatography/time-of-flight mass spectrometry (GC/TOFMS) data were processed using hierarchical multivariate curve resolution (H-MCR), and orthogonal partial least-squares discriminant analysis (OPLS-DA) was used to model the systematic variation related to the acute effect of strenuous exercise. Potential metabolic biomarkers were identified using database comparisons. Extensive validation was carried out, including predictive H-MCR, 7-fold full cross-validation and predictions for the OPLS-DA model, variable permutation for highlighting interesting metabolites, and pairwise t-tests for examining the significance of metabolites. The concentration changes of potential biomarkers were verified in the raw GC/TOFMS data. In total, 420 potential metabolites were resolved in the serum samples. On the basis of the relative concentrations of the 420 resolved metabolites, a valid multivariate model for the difference between pre- and post-exercise subjects was obtained. A total of 34 metabolites were highlighted as potential biomarkers, all statistically significant (p < 8.1E-05). As an example, two potential markers were identified as glycerol and asparagine. The concentration changes for these two metabolites were also verified in the raw GC/TOFMS data. The strategy was shown to facilitate interpretation and validation of metabolic interactions in human serum, as well as to reveal the identity of potential markers for known or novel mechanisms of human exercise physiology. The multivariate way of addressing metabolism studies can help to increase the understanding of the integrative biology behind exercise physiology, as well as to unravel new mechanistic explanations in relation to it.
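The abstract mentions pairwise t-tests on pre- and post-exercise metabolite levels. As a minimal sketch of that kind of test, and not the authors' actual implementation, the paired t statistic for a single metabolite can be computed as follows. The function name `paired_t_statistic` and the concentration values are hypothetical and chosen only for illustration; converting the statistic to a p-value would additionally require the Student t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for paired pre/post measurements.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    n = len(pre)
    if n < 2:
        raise ValueError("need at least two paired observations")
    # Work on within-subject differences, which is what makes the test paired.
    diffs = [after - before for before, after in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up relative concentrations of one metabolite for five subjects.
pre_exercise = [1.0, 1.2, 0.9, 1.1, 1.0]
post_exercise = [1.8, 2.1, 1.5, 2.0, 1.7]

t_value, dof = paired_t_statistic(pre_exercise, post_exercise)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

In a screening setting like the one described, such a test would be repeated for each highlighted metabolite, and the resulting p-values would be compared against a stringent threshold to account for the large number of variables.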