Department of Psychiatry, University of Vermont College of Medicine, Burlington, Vermont, USA.
Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands.
Hum Brain Mapp. 2022 Jan;43(1):555-565. doi: 10.1002/hbm.25248. Epub 2020 Oct 16.
To identify neuroimaging biomarkers of alcohol dependence (AD) from structural magnetic resonance imaging, it may be useful to develop classification models that are explicitly generalizable to unseen sites and populations. This problem was explored in a mega-analysis of previously published datasets from 2,034 AD and comparison participants spanning 27 sites curated by the ENIGMA Addiction Working Group. Data were grouped into a training set used for internal validation including 1,652 participants (692 AD, 24 sites), and a test set used for external validation with 382 participants (146 AD, 3 sites). An exploratory data analysis was first conducted, followed by an evolutionary-search-based feature selection to identify site-generalizable and high-performing subsets of brain measurements. Exploratory data analysis revealed that inclusion of case-only and control-only sites led to the inadvertent learning of site effects. Cross-validation methods that do not properly account for site can drastically overestimate performance. Evolutionary-based feature selection leveraging leave-one-site-out cross-validation, used to combat unintentional learning of site effects, identified cortical thickness in the left superior frontal gyrus and right lateral orbitofrontal cortex, cortical surface area in the right transverse temporal gyrus, and left putamen volume as final features. Ridge regression restricted to these features yielded a test-set area under the receiver operating characteristic curve of 0.768. These findings evaluate strategies for handling multi-site data with varied underlying class distributions and identify potential biomarkers for individuals with current AD.
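The leave-one-site-out evaluation described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the toy data, the function names (`leave_one_site_out`, `fit_direction`, `roc_auc`), and the one-feature "direction" model standing in for ridge regression are all assumptions made for illustration. It shows two points from the abstract: every fold tests on a site the model never saw, and a case-only site has no defined AUC on its own.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx), holding out one site per fold."""
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for s in sorted(by_site) if s != held_out for i in by_site[s]]
        yield held_out, train_idx, test_idx


def fit_direction(x, y):
    """Hypothetical stand-in for a real model (e.g. ridge regression):
    learn only whether cases have higher (+1) or lower (-1) feature values."""
    pos = [v for v, t in zip(x, y) if t == 1]
    neg = [v for v, t in zip(x, y) if t == 0]
    return 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control
    (ties count one half). Returns None when only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data: site ID, diagnosis (1 = AD, 0 = comparison), one brain measure.
sites = ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C"]
labels = [1, 0, 1, 0, 1, 0, 0, 1, 1, 1]  # site C is case-only
feature = [2.1, 2.6, 2.3, 2.7, 2.0, 2.5, 2.8, 2.6, 2.2, 2.1]

per_site_auc = {}
for site, train, test in leave_one_site_out(sites):
    # Fit on all other sites, then score the held-out site only.
    direction = fit_direction([feature[i] for i in train], [labels[i] for i in train])
    scores = [direction * feature[i] for i in test]
    per_site_auc[site] = roc_auc([labels[i] for i in test], scores)

# Average only over sites where AUC is defined (both classes present).
defined = [auc for auc in per_site_auc.values() if auc is not None]
mean_auc = sum(defined) / len(defined)
print(per_site_auc, mean_auc)
```

In this sketch the case-only site C still contributes training data to the other folds, but it cannot be scored by itself, which is one reason single-class sites need separate handling when results are summarized per site.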