Xiao Xuan, Li Yihui, Wu Qiaoboyang, Liu Xinting, Cao Xu, Li Maiping, Liu Jianjing, Gong Lianggeng, Dai Xi-Jian
Department of Radiology, The Second Affiliated Hospital, Jiangxi Medical College, Nanchang University, Minde Road No. 1, Nanchang, Jiangxi Province, 330006, China.
Jiangxi Provincial Key Laboratory of Intelligent Medical Imaging, Nanchang, 330006, China.
Alzheimers Res Ther. 2025 May 13;17(1):103. doi: 10.1186/s13195-025-01750-6.
Depression is a prodromal symptom of dementia, and individuals with depression have a significantly higher risk of developing dementia. This study aimed to develop and validate a novel machine learning-based dementia risk prediction tool for middle-aged and elderly individuals with depression.
This study included 31,587 middle-aged and elderly individuals with depression, drawn from a large UK population-based prospective cohort, who had no diagnosis of dementia at baseline. A rigorous variable selection strategy was used to identify risk and protective factors for dementia from an initial pool of 190 candidate variables, of which 27 were retained. Eight distinct data analysis strategies were used to develop and validate the dementia risk prediction model. DeLong's test was applied to assess whether differences between models were statistically significant.
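The model comparison step can be illustrated with a minimal sketch of DeLong's test for two correlated AUCs, meaning two models scored on the same participants. This is not the authors' code: the function names and the toy labels and scores are hypothetical, and the direct pairwise computation is chosen for readability rather than speed.

```python
import math

def _psi(x, y):
    # Pairwise kernel: 1 if the case outranks the control, 0.5 on a tie, else 0.
    return 1.0 if x > y else 0.5 if x == y else 0.0

def _components(pos, neg):
    # Structural components of the AUC for one model.
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]  # one per case
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]  # one per control
    return sum(v10) / len(v10), v10, v01

def _cov(a, b):
    # Sample covariance with an (n - 1) denominator.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same sample."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    auc_a, v10_a, v01_a = _components([scores_a[i] for i in pos],
                                      [scores_a[i] for i in neg])
    auc_b, v10_b, v01_b = _components([scores_b[i] for i in pos],
                                      [scores_b[i] for i in neg])
    # Variance of the AUC difference, accounting for the models' correlation.
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / len(pos)
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / len(neg))
    if var <= 0:
        raise ValueError("variance of the AUC difference is zero; test is undefined")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p

# Hypothetical toy data: 4 cases (label 1) and 6 controls (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2, 0.1, 0.05]
scores_b = [0.6, 0.9, 0.3, 0.5, 0.7, 0.4, 0.8, 0.2, 0.1, 0.35]
auc_a, auc_b, z, p = delong_test(labels, scores_a, scores_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, z = {z:.3f}, p = {p:.3f}")
```

The pairwise loops scale with the number of cases times the number of controls, so a cohort of this size would usually call for a rank-based implementation of the same statistic.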
During a median follow-up of 7.98 years, 896 incident dementia cases were identified among study participants. In model development using an 8:2 data split (with fivefold cross-validation for training), the AdaBoost classifier achieved the best performance (AUC 0.861 ± 0.003), followed by the XGBoost (AUC 0.839 ± 0.005) and CatBoost (AUC 0.828 ± 0.007) classifiers. To facilitate community generalization and clinical applicability, we developed a simplified model using a forward feature subset selection algorithm, retaining 12 variables. The simplified model maintained robust performance, with AdaBoost achieving the highest discriminative ability (AUC 0.859 ± 0.002), followed by XGBoost (AUC 0.835 ± 0.001) and CatBoost (AUC 0.821 ± 0.005). DeLong's test revealed no statistically significant difference in AUC between the models using 12 and 27 variables (p = 0.278). For practical implementation, we deployed the optimal model as a web application for visualization and dementia risk assessment, named DRP-Depression.
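The idea behind forward feature subset selection can be sketched as a greedy loop that adds whichever candidate variable most improves a score and stops when nothing helps. The abstract does not say how subsets were scored or when selection stopped, so everything here is an assumption for illustration: the feature names `x1` to `x3`, the toy data, and the scoring function, which uses the AUC of a simple sum of the selected columns as a stand-in for a cross-validated classifier AUC.

```python
def auc(labels, scores):
    # AUC as the probability that a case outranks a control (ties count half).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def forward_select(candidates, score_fn, max_features, min_gain=0.0):
    """Greedy forward selection: add the best remaining feature each round."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        # Score every one-feature extension of the current subset.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feat = max(scored, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break  # no candidate improves the score enough
        selected.append(top_feat)
        remaining.remove(top_feat)
        best = top_score
    return selected, best

# Hypothetical toy data: 3 cases, 3 controls, 3 candidate features.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "x1": [3, 2, 1, 2, 0, 0],
    "x2": [0, 1, 2, 0, 1, 0],
    "x3": [1, 0, 1, 1, 0, 1],
}

def subset_auc(features):
    # Stand-in scorer (assumption): AUC of the unweighted sum of the columns.
    sums = [sum(columns[f][i] for f in features) for i in range(len(labels))]
    return auc(labels, sums)

selected, best = forward_select(list(columns), subset_auc, max_features=3)
print(selected, round(best, 3))
```

In a realistic setting `score_fn` would refit the chosen classifier on each candidate subset within the training folds, which keeps the held-out 20% untouched until the final evaluation.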
We developed a practical, easily disseminated risk prediction model based on machine learning algorithms and deployed it as a web application, providing a new and convenient tool for dementia risk prediction in middle-aged and elderly individuals with depression.