Zarei Soheil, Saffar Mohsen, Shalbaf Reza, Hassani Abharian Peyman, Shalbaf Ahmad
Institute for Cognitive Science Studies, Tehran, Iran.
Department of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran.
Phys Eng Sci Med. 2025 Aug 11. doi: 10.1007/s13246-025-01618-x.
Alzheimer's disease (AD) is a neurodegenerative disorder for which early diagnosis and intervention remain challenging, and the black-box nature of many predictive models limits their clinical adoption. In this study, we developed a machine learning (ML) framework that integrates hierarchical feature selection with multiple classifiers to predict progression from mild cognitive impairment (MCI) to AD. Using baseline data from 580 participants in the Alzheimer's Disease Neuroimaging Initiative (ADNI), categorized into stable MCI (sMCI) and progressive MCI (pMCI) subgroups, we analyzed features both individually and across seven key groups. The neuropsychological test group exhibited the highest predictive power, and several of the top individual predictors came from this domain. Hierarchical feature selection, combining initial statistical filtering with machine-learning-based refinement, narrowed the feature set to the eight most informative variables. To make model decisions interpretable, we applied SHAP (SHapley Additive exPlanations) analysis, quantifying each feature's contribution to conversion risk. The explainable random forest classifier, optimized on these selected features, achieved 83.79% accuracy (84.93% sensitivity, 83.32% specificity), outperforming the other methods, and the analysis identified hippocampal volume, delayed memory recall (LDELTOTAL), and Functional Activities Questionnaire (FAQ) scores as the top drivers of conversion. These results underscore the effectiveness of combining diverse data sources with ML models and show that transparent, SHAP-driven insights align with known AD biomarkers, turning our model from a predictive black box into a clinically actionable tool for early diagnosis and patient stratification.
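To illustrate what "quantifying each feature's contribution" means, the following minimal sketch computes exact Shapley values by brute force for a single hypothetical patient. It is not the authors' code: the `toy_risk` function, its coefficients, the z-scored feature names, and the patient values are all invented for illustration, and the toy model is a simple formula rather than a random forest. SHAP implementations for tree models use faster algorithms, but the quantity they attribute is the same Shapley value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    A feature that is "absent" from a coalition takes its baseline value.
    Cost grows as 2^n in the number of features, so this only suits tiny n.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Model output when only the features in `coalition` are "present".
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(x):
    """Hypothetical conversion-risk score; coefficients are invented."""
    return (
        0.5
        - 0.10 * x["hippocampus_z"]   # smaller volume raises the score
        - 0.08 * x["ldeltotal_z"]     # worse delayed recall raises the score
        + 0.06 * x["faq_z"]           # more functional impairment raises it
        - 0.02 * x["ldeltotal_z"] * x["faq_z"]  # illustrative interaction
    )


# Hypothetical patient (z-scores) and a baseline of cohort-average values.
patient = {"hippocampus_z": -1.5, "ldeltotal_z": -1.0, "faq_z": 2.0}
baseline = {"hippocampus_z": 0.0, "ldeltotal_z": 0.0, "faq_z": 0.0}

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>14}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that makes a per-feature ranking such as the one reported in the abstract interpretable. The interaction term is split equally between the two features involved.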