Timmerman Marieke E, Hoefsloot Huub C J, Smilde Age K, Ceulemans Eva
University of Groningen, Grote Kruisstraat 2/1, 9712TS Groningen, The Netherlands.
Biosystems Data Analysis, Faculty of Sciences, University of Amsterdam, Amsterdam, The Netherlands.
Metabolomics. 2015;11(5):1265-1276. doi: 10.1007/s11306-015-0785-8. Epub 2015 Feb 14.
In omics research, high-dimensional data are often collected according to an experimental design. Typically, the manipulations involved yield differential effects on subsets of variables. An effective approach to identifying those effects is ANOVA-simultaneous component analysis (ASCA), which combines analysis of variance with principal component analysis. So far, pre-treatment in ASCA has received hardly any attention, even though its effects can be huge. In this paper, we describe various strategies for scaling and identify a rational approach. We present the approaches in matrix algebra terms and illustrate them with an insightful simulated example. We show that scaling directly influences which data aspects are stressed in the analysis, and hence which become apparent in the solution. Therefore, the cornerstone of proper scaling is to use a scaling factor that is free from the effect of interest. This implies that proper scaling depends on the effect(s) of interest, and that different types of scaling may be proper for the different effect matrices. Using a real-life example from nutritional research, we illustrate that different scaling approaches can greatly affect the ASCA interpretation. The principle that scaling factors should be free from the effect of interest generalizes to other statistical methods that involve scaling, such as classification methods.
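The scaling principle stated in the abstract can be made concrete with a small sketch. The code below is not the authors' implementation; it is a minimal illustration under stated assumptions: a single treatment factor, a balanced design, and the pooled within-group standard deviation as one possible scaling factor that is free from the treatment effect. It is contrasted with ordinary autoscaling by the total standard deviation, which does contain the treatment effect. The function names `effect_matrix` and `asca_one_factor` are hypothetical.

```python
import numpy as np


def effect_matrix(X, groups):
    """Column-center X and build the one-factor ANOVA effect matrix.

    Each row of the returned effect matrix holds the mean of the centered
    data for the group that row belongs to.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    Xc = X - X.mean(axis=0)
    XA = np.zeros_like(Xc)
    for g in np.unique(groups):
        idx = groups == g
        XA[idx] = Xc[idx].mean(axis=0)
    return Xc, XA


def asca_one_factor(X, groups, scaling="residual"):
    """Illustrative one-factor ASCA: ANOVA decomposition, scaling, then PCA.

    scaling="total":    divide by the overall SD (includes the effect).
    scaling="residual": divide by the pooled within-group SD, an assumed
                        example of a scaling factor free from the effect.
    """
    Xc, XA = effect_matrix(X, groups)
    n_groups = len(np.unique(groups))
    if scaling == "total":
        s = Xc.std(axis=0, ddof=1)
    elif scaling == "residual":
        E = Xc - XA  # within-group residuals
        s = np.sqrt((E ** 2).sum(axis=0) / (Xc.shape[0] - n_groups))
    else:
        raise ValueError("scaling must be 'total' or 'residual'")
    # Column-wise scaling commutes with the ANOVA decomposition, so the
    # effect matrix can be scaled directly.
    U, sv, Vt = np.linalg.svd(XA / s, full_matrices=False)
    scores = U * sv
    loadings = Vt.T
    return scores, loadings, s


# Simulated data (purely illustrative): variable 0 carries a treatment
# effect, variable 2 has large noise but no effect.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 3))
X[groups == 1, 0] += 5.0
X[:, 2] *= 10.0

for scaling in ("total", "residual"):
    scores, loadings, s = asca_one_factor(X, groups, scaling)
    print(scaling, "scale factors:", np.round(s, 2),
          "|PC1 loadings|:", np.round(np.abs(loadings[:, 0]), 2))
```

In this sketch, the total standard deviation of variable 0 is inflated by the treatment effect itself, so autoscaling shrinks exactly the variable that carries the effect. The within-group scaling factor does not have this property, which is the sense in which it is "free from the effect of interest". For designs with several factors, the appropriate scaling factor may differ per effect matrix, as the abstract notes.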