Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, C. Doctor Aiguader 88, Edif. PRBB, 08003, Barcelona, Spain.
BarcelonaBeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain.
Neuroinformatics. 2019 Oct;17(4):583-592. doi: 10.1007/s12021-019-09416-z.
Multivariate methods have the potential to better capture complex relationships that may exist between different biological levels. Multiple Factor Analysis (MFA) is one of the most popular methods for obtaining factor scores and measures of discrepancy between data sets. However, singular value decomposition in MFA is based on PCA, which is adequate only if the data are normally distributed, linear or stationary. In addition, including strongly correlated variables can overemphasize the contribution of the estimated components. In this work, we introduce a novel method, referred to as Independent Multifactorial Analysis (ICA-MFA), to derive relevant features from multiscale data. This method is an extended implementation of MFA in which the component value decomposition is based on Independent Component Analysis. In addition, ICA-MFA incorporates a predictive step based on Independent Component Regression. We evaluated the performance of ICA-MFA in a simulation study and compared it with both the MFA method and traditional univariate analyses. We showed that ICA-MFA explained up to 10-fold more variance than MFA and univariate methods. We applied the proposed algorithm in a study of 4057 individuals belonging to the population-based Rotterdam Study with available genetic and neuroimaging data, as well as information about executive cognitive functioning. Specifically, we used ICA-MFA to detect relevant genetic features related to structural brain regions, which in turn were involved in the mechanisms of executive cognitive function. The proposed strategy makes it possible to determine the degree to which the whole set of genetic and/or neuroimaging markers contributes to the variability of the symptomatology jointly, rather than individually.
While univariate results and MFA combinations only explained a limited proportion of variance (less than 2%), our method increased the explained variance (10%) and allowed the identification of significant components that maximize the variance explained in the model. The potential application of the ICA-MFA algorithm constitutes an important aspect of integrating multivariate multiscale data, specifically in the field of Neurogenetics.
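The three steps named in the abstract (MFA-style block weighting, a decomposition based on Independent Component Analysis, and a regression on the resulting components) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tanh-based symmetric FastICA update, the choice of three components, and the synthetic "genetic", "imaging" and "outcome" arrays are all hypothetical and chosen only to make the example runnable with NumPy.

```python
import numpy as np

def mfa_weight(blocks):
    """Standardize each block and divide it by its first singular value
    (the usual MFA weighting), then concatenate the blocks column-wise.
    Assumes no column is constant."""
    weighted = []
    for X in blocks:
        Z = (X - X.mean(axis=0)) / X.std(axis=0)
        s1 = np.linalg.svd(Z, compute_uv=False)[0]
        weighted.append(Z / s1)
    return np.hstack(weighted)

def whiten(X, n_components):
    """Project onto the top principal axes and rescale to unit variance."""
    U, _, _ = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return U[:, :n_components] * np.sqrt(X.shape[0])

def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fast_ica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on whitened data Z
    of shape (n_samples, k). Returns the estimated component scores."""
    rng = np.random.default_rng(seed)
    n, k = Z.shape
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        S = Z @ W.T                      # current component estimates
        G = np.tanh(S)
        G_prime = 1.0 - G ** 2
        # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, one row per component
        W_new = (G.T @ Z) / n - np.diag(G_prime.mean(axis=0)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when each new row is (anti)parallel to the old one
        change = np.max(np.abs(np.abs(np.einsum("ij,ij->i", W_new, W)) - 1.0))
        W = W_new
        if change < tol:
            break
    return Z @ W.T

def ic_regression(S, y):
    """Ordinary least squares of y on the component scores plus an intercept."""
    A = np.column_stack([np.ones(len(y)), S])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, 1.0 - resid.var() / y.var()

# Synthetic stand-ins for two data blocks and an outcome (hypothetical data)
rng = np.random.default_rng(42)
n = 300
sources = rng.laplace(size=(n, 3))       # non-Gaussian latent factors
genetic = sources @ rng.standard_normal((3, 6)) + 0.1 * rng.standard_normal((n, 6))
imaging = sources @ rng.standard_normal((3, 4)) + 0.1 * rng.standard_normal((n, 4))
outcome = 2.0 * sources[:, 0] + 0.5 * rng.standard_normal(n)

X = mfa_weight([genetic, imaging])
S = fast_ica(whiten(X, n_components=3))
beta, r2 = ic_regression(S, outcome)
print(f"R^2 of outcome on 3 independent components: {r2:.3f}")
```

Dividing each block by its first singular value keeps a block with many correlated variables from dominating the joint decomposition. The regression then reports how much outcome variance the components explain jointly, which is the quantity the abstract compares across methods.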