De Bonis Maria Luigia Natalia, Fasano Giuseppe, Lombardi Angela, Ardito Carmelo, Ferrara Antonio, Di Sciascio Eugenio, Di Noia Tommaso
Department of Electrical and Information Engineering, Polytechnic University of Bari, Via E. Orabona, 4, 70125, Bari, Italy.
Brain Inform. 2024 Dec 18;11(1):33. doi: 10.1186/s40708-024-00244-9.
Brain age, a biomarker reflecting brain health relative to chronological age, is increasingly used in neuroimaging to detect early signs of neurodegenerative diseases and support personalized treatment plans. Two primary approaches for brain age prediction have emerged: morphometric feature extraction from MRI scans and deep learning (DL) applied to raw MRI data. However, systematic comparisons of these methods in terms of performance, interpretability, and clinical utility have been limited. In this study, we present a comparative evaluation of two pipelines: one using morphometric features from FreeSurfer and the other employing 3D convolutional neural networks (CNNs). Using a multisite neuroimaging dataset, we assessed both model performance and the interpretability of predictions through eXplainable Artificial Intelligence (XAI) methods, applying SHAP to the feature-based pipeline and Grad-CAM and DeepSHAP to the CNN-based pipeline. The two pipelines performed comparably in Leave-One-Site-Out (LOSO) validation and achieved state-of-the-art performance on the independent test set, using a DNN on morphometric features in the first pipeline and a DenseNet-121 architecture in the second. SHAP provided the most consistent and interpretable results, while DeepSHAP exhibited greater variability. Further work is needed to assess the clinical utility of Grad-CAM. This study addresses a critical gap by systematically comparing the interpretability of multiple XAI methods across distinct brain age prediction pipelines. Our findings underscore the importance of integrating XAI into clinical practice and offer insight into how XAI outputs vary and how useful they may be to clinicians.
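To make the Leave-One-Site-Out (LOSO) protocol mentioned above concrete, the following is a minimal sketch in plain Python, not the authors' implementation. All names (`leave_one_site_out`, `loso_mae`, `fit_mean_age`, `predict_mean_age`) and the toy data are hypothetical, and the mean-age baseline is only a stand-in for the DNN and DenseNet-121 models of the study. Each site is held out in turn, the model is fitted on the remaining sites, and the mean absolute error (MAE) is computed on the held-out site.

```python
from collections import defaultdict
from statistics import mean


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) for each distinct site."""
    by_site = defaultdict(list)
    for index, site in enumerate(sites):
        by_site[site].append(index)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(
            i for site, idx in by_site.items() if site != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ages."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def loso_mae(features, ages, sites, fit, predict):
    """Return {site: MAE obtained when that site is held out}."""
    scores = {}
    for site, train_idx, test_idx in leave_one_site_out(sites):
        model = fit([features[i] for i in train_idx], [ages[i] for i in train_idx])
        predictions = predict(model, [features[i] for i in test_idx])
        scores[site] = mean_absolute_error([ages[i] for i in test_idx], predictions)
    return scores


# Hypothetical baseline: ignore the features and predict the mean training age.
def fit_mean_age(train_features, train_ages):
    return mean(train_ages)


def predict_mean_age(model, test_features):
    return [model for _ in test_features]


# Toy data (assumed for illustration): one feature per subject, three sites.
toy_features = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
toy_ages = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
toy_sites = ["A", "A", "B", "B", "C", "C"]

toy_scores = loso_mae(toy_features, toy_ages, toy_sites, fit_mean_age, predict_mean_age)
```

Because no subject from the held-out site is ever seen during fitting, the per-site MAE reflects how well a model generalizes to an unseen acquisition site. This is the motivation for LOSO in multisite neuroimaging data.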