Centre for Medical Image Computing, Department of Computer Science, University College London, London, United Kingdom; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom.
Biol Psychiatry. 2020 Feb 15;87(4):368-376. doi: 10.1016/j.biopsych.2019.12.001. Epub 2019 Dec 10.
In 2009, the National Institute of Mental Health launched the Research Domain Criteria, an attempt to move beyond diagnostic categories and ground psychiatry within neurobiological constructs that combine different levels of measures (e.g., brain imaging and behavior). Statistical methods that can integrate such multimodal data, however, are often vulnerable to overfitting, poor generalization, and difficulties in interpreting the results.
We propose an innovative machine learning framework that combines multiple holdouts and a stability criterion with regularized multivariate techniques, such as sparse partial least squares and kernel canonical correlation analysis, to identify hidden dimensions of cross-modality relationships. To illustrate the approach, we investigated structural brain-behavior associations in an extensively phenotyped developmental sample of 345 participants (312 healthy and 33 with clinical depression). The brain data consisted of whole-brain voxel-based gray matter volumes, and the behavioral data included item-level self-report questionnaires as well as IQ and demographic measures.
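The combination of sparse partial least squares with multiple holdouts and a stability check can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sparse_pls`, `multiple_holdouts`), the soft-thresholding update, the relative penalty parameters `lam_u` and `lam_v`, the 20% holdout fraction, and the use of mean pairwise cosine similarity of the brain weights as the stability measure are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam` (L1 proximal step)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def unit(a):
    """Scale `a` to unit L2 norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a


def sparse_pls(X, Y, lam_u, lam_v, n_iter=50):
    """One pair of sparse weight vectors (u, v) maximising cov(Xu, Yv).

    lam_u and lam_v are illustrative penalties in [0, 1): the fraction of
    the largest absolute entry that is subtracted before renormalising.
    """
    C = X.T @ Y                                      # cross-covariance (unscaled)
    v = np.linalg.svd(C, full_matrices=False)[2][0]  # leading right singular vector
    u = np.zeros(X.shape[1])
    for _ in range(n_iter):
        a = C @ v
        u = unit(soft_threshold(a, lam_u * np.abs(a).max()))
        b = C.T @ u
        v = unit(soft_threshold(b, lam_v * np.abs(b).max()))
    return u, v


def multiple_holdouts(X, Y, lam_u, lam_v, n_splits=10, holdout_frac=0.2, seed=0):
    """Fit on random training splits and score each fit on its holdout set.

    Returns the holdout correlations and an assumed stability measure:
    the mean absolute cosine similarity between brain weights across splits.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_hold = int(round(holdout_frac * n))
    corrs, weights = [], []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        hold, train = perm[:n_hold], perm[n_hold:]
        # Standardise with training statistics only, to avoid leakage.
        mu_x, sd_x = X[train].mean(0), X[train].std(0) + 1e-12
        mu_y, sd_y = Y[train].mean(0), Y[train].std(0) + 1e-12
        u, v = sparse_pls((X[train] - mu_x) / sd_x,
                          (Y[train] - mu_y) / sd_y, lam_u, lam_v)
        x_scores = ((X[hold] - mu_x) / sd_x) @ u
        y_scores = ((Y[hold] - mu_y) / sd_y) @ v
        corrs.append(np.corrcoef(x_scores, y_scores)[0, 1])
        weights.append(u)
    W = np.array(weights)                  # rows have unit norm
    cosines = np.abs(W @ W.T)[np.triu_indices(n_splits, k=1)]
    return np.array(corrs), float(cosines.mean())
```

In this sketch, a brain-behavior dimension would be considered trustworthy only if the holdout correlations are consistently high and the weight vectors stay similar across splits; the thresholds for either criterion would have to be chosen by the analyst.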
Both sparse partial least squares and kernel canonical correlation analysis captured two hidden dimensions of brain-behavior relationships: one related to age and drinking and the other one related to depression. The applied machine learning framework indicates that these results are stable and generalize well to new data. Indeed, the identified brain-behavior associations are in agreement with previous findings in the literature concerning age, alcohol use, and depression-related changes in brain volume.
Multivariate techniques (such as sparse partial least squares and kernel canonical correlation analysis) embedded in our novel framework are promising tools to link behavior and/or symptoms to neurobiology and thus have great potential to contribute to a biologically grounded definition of psychiatric disorders.