Sancho Martina Lapera, Ellis Charles A, Miller Robyn L, Calhoun Vince D
Tri-institutional Center for Translational Research in Neuroimaging and Data Science, Georgia State University, Georgia Institute of Technology, and Emory University, Atlanta, USA.
bioRxiv. 2024 Feb 13:2024.02.09.579600. doi: 10.1101/2024.02.09.579600.
The diagnosis of schizophrenia (SZ) can be challenging due to its diverse symptom presentation. As such, many studies have sought to identify diagnostic biomarkers of SZ using explainable machine learning methods. However, the generalizability of identified biomarkers in many machine learning-based studies is highly questionable given that most studies only analyze explanations from a small number of models. In this study, we present (1) a novel feature interaction-based explainability approach and (2) several new approaches for summarizing multi-model explanations. We implement our approach within the context of electroencephalogram (EEG) spectral power data. We further analyze both training and test set explanations with the goal of extracting generalizable insights from the models. Importantly, our analyses identify effects of SZ upon the α, β, and θ frequency bands, the left hemisphere of the brain, and interhemispheric interactions across a majority of folds. We hope that our analysis will provide helpful insights into SZ and inspire the development of robust approaches for identifying neuropsychiatric disorder biomarkers from explainable machine learning models.