Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, UK.
Neuroimage. 2022 Apr 1;249:118854. doi: 10.1016/j.neuroimage.2021.118854. Epub 2021 Dec 29.
Canonical Correlation Analysis (CCA) and its regularised versions have been widely used in the neuroimaging community to uncover multivariate associations between two data modalities (e.g., brain imaging and behaviour). However, these methods have inherent limitations: (1) statistical inferences about the associations are often not robust; (2) the associations within each data modality are not modelled; (3) missing values need to be imputed or removed. Group Factor Analysis (GFA) is a hierarchical model that addresses the first two limitations by providing Bayesian inference and modelling modality-specific associations. Here, we propose an extension of GFA that handles missing data, and highlight that GFA can be used as a predictive model. We applied GFA to synthetic data and to real data consisting of brain connectivity and non-imaging measures from the Human Connectome Project (HCP). In synthetic data, GFA uncovered the underlying shared and specific factors and correctly predicted the non-observed data modalities in both complete and incomplete data sets. In the HCP data, we identified four relevant shared factors, capturing associations between mood, alcohol and drug use, cognition, demographics and psychopathological measures on the one hand and the default mode, frontoparietal control, dorsal and ventral networks and insula on the other, as well as two factors describing associations within brain connectivity. In addition, GFA predicted a set of non-imaging measures from brain connectivity. These findings were consistent across complete and incomplete data sets, and replicated previous findings in the literature. GFA is a promising tool that can be used to uncover associations between and within multiple data modalities in benchmark datasets (such as HCP), and can be easily extended to more complex models to solve more challenging tasks.
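The distinction between shared and modality-specific factors can be made concrete with a small simulation. The sketch below is a minimal illustration only, assuming a linear Gaussian latent factor model in which each factor is switched on or off per modality; it is not the authors' implementation, and it generates data rather than performing Bayesian inference. The function name `simulate_gfa`, the dimensions, the noise level, the factor activity pattern and the choice to place missing values in the first modality are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_gfa(n_samples=200, dims=(30, 20), missing_frac=0.1, seed=0):
    """Draw two data modalities from a GFA-style linear latent factor model.

    All sizes and the activity pattern are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Rows are factors, columns are modalities; 1 means the factor is active.
    # Factor 0 is shared, factor 1 is specific to modality 0,
    # factor 2 is specific to modality 1.
    activity = np.array([[1, 1],
                         [1, 0],
                         [0, 1]])
    n_factors = activity.shape[0]

    # Latent factors, one row per sample.
    Z = rng.standard_normal((n_samples, n_factors))

    X, W = [], []
    for m, d in enumerate(dims):
        # Loadings for modality m; columns of inactive factors are zeroed.
        W_m = rng.standard_normal((d, n_factors)) * activity[:, m]
        # Observed data = latent factors projected by the loadings + noise.
        X_m = Z @ W_m.T + 0.5 * rng.standard_normal((n_samples, d))
        W.append(W_m)
        X.append(X_m)

    # Mark a random fraction of entries in modality 0 as missing (NaN).
    mask = rng.random(X[0].shape) < missing_frac
    X[0] = np.where(mask, np.nan, X[0])
    return X, W, Z, activity
```

In this toy setting, a shared factor has non-zero loadings in both modalities, while a specific factor has non-zero loadings in only one. A model of the kind described in the abstract would have to recover that pattern from `X` alone, without discarding or imputing the NaN entries beforehand.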