Nakua Hajer, Yu Ju-Chi, Abdi Hervé, Hawco Colin, Voineskos Aristotle, Hill Sean, Lai Meng-Chuan, Wheeler Anne L, McIntosh Anthony Randal, Ameis Stephanie H
Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, Ontario, Canada.
Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada.
bioRxiv. 2023 Mar 9:2023.03.08.531763. doi: 10.1101/2023.03.08.531763.
Canonical Correlation Analysis (CCA) and Partial Least Squares Correlation (PLS) detect associations between two data matrices by computing linear combinations of the variables in each matrix (called latent variables; LVs). These LVs maximize either correlation (CCA) or covariance (PLS). Because the maximization criteria differ, one approach may be more stable and reproducible than the other when applied to brain and behavioural data at the population level. This study compared the LVs that emerged from CCA and PLS analyses of brain-behaviour relationships in the Adolescent Brain Cognitive Development (ABCD) dataset and examined their stability and reproducibility.
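The difference between the two criteria can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the simulated data, and the QR-based CCA solution are assumptions made for illustration, and the CCA routine assumes more observations than variables and full-column-rank matrices.

```python
import numpy as np

def zscore(a):
    """Center each column and scale it to unit sample variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def pls_correlation(X, Y):
    """PLS correlation: SVD of the cross-correlation matrix of X and Y.

    Each singular value is the covariance between a pair of latent scores.
    """
    Xz, Yz = zscore(X), zscore(Y)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U, s, Vt.T

def cca(X, Y):
    """CCA via QR decomposition (assumes n > p, n > q, full column rank).

    Each singular value is the correlation between a pair of latent scores.
    """
    Xz, Yz = zscore(X), zscore(Y)
    Qx, Rx = np.linalg.qr(Xz)
    Qy, Ry = np.linalg.qr(Yz)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    A = np.linalg.solve(Rx, U)     # weights for the columns of Xz
    B = np.linalg.solve(Ry, Vt.T)  # weights for the columns of Yz
    return A, s, B

# Hypothetical simulated data: one shared signal drives both matrices.
rng = np.random.default_rng(0)
n = 500
shared = rng.standard_normal((n, 1))
X = shared + rng.standard_normal((n, 6))  # stand-in for brain measures
Y = shared + rng.standard_normal((n, 3))  # stand-in for behavioural scores

U, s_pls, V = pls_correlation(X, Y)
A, s_cca, B = cca(X, Y)

# Latent scores for the first LV under each method.
pls_x, pls_y = zscore(X) @ U[:, 0], zscore(Y) @ V[:, 0]
cca_x, cca_y = zscore(X) @ A[:, 0], zscore(Y) @ B[:, 0]
```

In this sketch the first PLS singular value equals the covariance of `pls_x` and `pls_y`, while the first CCA singular value equals the correlation of `cca_x` and `cca_y`, which is why singular values from the two methods are not on the same scale.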
Structural T1-weighted imaging and behavioural data were accessed from the baseline Adolescent Brain Cognitive Development dataset (N > 9000, ages = 9-11 years). The brain matrix consisted of cortical thickness estimates in different cortical regions. The behavioural matrix consisted of either 11 subscale scores from the parent-reported Child Behavior Checklist (CBCL) or 7 cognitive performance measures from the NIH Toolbox. CCA and PLS models were applied separately to the brain-CBCL analysis and the brain-cognition analysis. A permutation test was used to assess whether the identified LVs were statistically significant. A series of resampling methods was used to assess the stability and reproducibility of the LVs.
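A permutation test of the kind described above can be sketched as follows. This is a generic illustration under stated assumptions, not the study's actual procedure: it tests only the first PLS singular value, shuffles the rows of the behavioural matrix to break the brain-behaviour pairing, and uses simulated data. The function names, the number of permutations, and the data are hypothetical.

```python
import numpy as np

def first_singular_value(X, Y):
    """First singular value of the cross-correlation matrix of X and Y."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    return np.linalg.svd(R, compute_uv=False)[0]

def permutation_p_value(X, Y, n_perm=200, seed=0):
    """Compare the observed singular value with a shuffled-row null."""
    rng = np.random.default_rng(seed)
    observed = first_singular_value(X, Y)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling rows of Y removes any pairing between X and Y rows.
        shuffled = Y[rng.permutation(Y.shape[0])]
        null[i] = first_singular_value(X, shuffled)
    # The +1 terms count the observed statistic as one draw from the null.
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p

# Hypothetical simulated data with a strong shared signal.
rng = np.random.default_rng(1)
n = 300
shared = rng.standard_normal((n, 1))
X = shared + rng.standard_normal((n, 4))
Y = shared + rng.standard_normal((n, 3))

observed, p_value = permutation_p_value(X, Y)

# The same test on unrelated matrices, for comparison.
X_noise = rng.standard_normal((n, 4))
Y_noise = rng.standard_normal((n, 3))
observed_noise, p_noise = permutation_p_value(X_noise, Y_noise)
```

With this estimator the smallest attainable p-value is 1 / (n_perm + 1), so the number of permutations bounds the precision of any reported significance threshold.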
When examining the relationship between cortical thickness and CBCL scores, the first LV was significant in both the CCA and PLS models (singular value: CCA = .13, PLS = .39, p < .001). The LV from the CCA model linked covariation of CBCL scores to covariation of cortical thickness, whereas the LV from the PLS model linked decreased cortical thickness to lower CBCL scores. There was limited evidence of stability or reproducibility of this LV for either CCA or PLS. When examining the relationship between cortical thickness and cognitive performance, there were 6 significant LVs for both CCA and PLS (p < .01). The first LV showed similar relationships in CCA and PLS and was stable and reproducible (singular value: CCA = .21, PLS = .43, p < .001).
CCA and PLS identify different brain-behaviour relationships with limited stability and reproducibility when examining the relationship between cortical thickness and parent-reported behavioural measures. However, both methods identified relatively similar brain-behaviour relationships that were stable and reproducible when examining the relationship between cortical thickness and cognitive performance. The results of the current study suggest that stability and reproducibility of brain-behaviour relationships identified by CCA and PLS are influenced by characteristics of the analyzed sample and the included behavioural measurements when applied to a large pediatric dataset.