Department of Elderly Health Management, Shenzhen Center for Chronic Disease Control, No.2021, Buxin Road, Shenzhen, Guangdong, 518020, China.
Shenzhen Yiwei Technology Company, Shenzhen, Guangdong, 518000, China.
BMC Public Health. 2024 May 1;24(1):1206. doi: 10.1186/s12889-024-18692-7.
Dementia is a leading cause of disability in people older than 65 years worldwide. However, diagnosing dementia in its earliest symptomatic stages remains challenging. This study combined specific questions from the AD8 scale with comprehensive health-related characteristics, and used machine learning (ML) to construct diagnostic models of cognitive impairment (CI).
The study was based on the Shenzhen Healthy Ageing Research (SHARE) project. We recruited 823 participants aged 65 years and older who completed a comprehensive health assessment and cognitive function assessments. Permutation importance was used to select features. Five ML models, each combined with BalanceCascade, were applied to predict CI: a support vector machine (SVM), multilayer perceptron (MLP), AdaBoost, gradient boosting decision tree (GBDT), and logistic regression (LR). As a baseline, CI was defined as an AD8 score ≥ 2. SHapley Additive exPlanations (SHAP) values were used to interpret the results of the ML models.
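Permutation importance, the feature-selection technique named above, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the toy data, the `toy_model` rule, and the function names are hypothetical, and accuracy stands in for whatever scoring metric the study actually used. The idea is to shuffle one feature column at a time and record how much the model's score drops.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(base - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: each row is (age, noise); the label depends on age only.
data_rng = random.Random(1)
rows = [(data_rng.randint(65, 95), data_rng.random()) for _ in range(200)]
labels = [int(age >= 80) for age, _ in rows]

# Hypothetical stand-in for a fitted classifier: it looks at age and nothing else.
toy_model = lambda r: int(r[0] >= 80)

importances = permutation_importance(toy_model, rows, labels)
print(importances)  # the age column gets a positive value, the noise column 0.0
```

A feature the model never uses has an importance of exactly zero, because shuffling it cannot change any prediction. Features would then be ranked by this value and the top ones kept.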
The first and sixth items of the AD8, platelets, waist circumference, body mass index, carcinoembryonic antigen, age, serum uric acid, white blood cells, abnormal electrocardiogram, heart rate, and sex were selected as predictive features. Compared to the baseline (AUC = 0.65), the MLP showed the highest performance (AUC: 0.83 ± 0.04), followed by AdaBoost (AUC: 0.80 ± 0.04), SVM (AUC: 0.78 ± 0.04), and GBDT (AUC: 0.76 ± 0.04). Furthermore, the accuracy, sensitivity, and specificity of these four ML models were higher than those of the baseline. SHAP summary plots based on the MLP showed that the feature with the greatest influence on a positive CI prediction was female sex, followed by older age and lower waist circumference.
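The metrics reported above (sensitivity, specificity, accuracy, AUC) can be computed for an AD8 ≥ 2 rule as in the following sketch. The eight scores and labels are invented purely for illustration and are not study data; the function names are hypothetical, and the AUC here is computed on the binary rule output, which may differ from how the authors derived their baseline AUC.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = CI)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example values, not taken from the study.
ad8_scores = [0, 1, 3, 2, 5, 0, 1, 4]
has_ci     = [0, 0, 1, 0, 1, 0, 1, 1]

# Baseline rule from the abstract: an AD8 score of 2 or more flags CI.
predicted = [int(s >= 2) for s in ad8_scores]

metrics = confusion_metrics(has_ci, predicted)
baseline_auc = auc(has_ci, predicted)
print(metrics, baseline_auc)
```

For a binary rule like this one, the AUC equals the mean of sensitivity and specificity, so a single cut-off cannot use the ranking information that a model producing continuous probabilities provides.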
The ML-based diagnostic models of CI, especially the MLP, were substantially more effective than the traditional AD8 scale with a cut-off score of ≥ 2. Our findings may offer new approaches to community dementia screening and help promote such screening while minimizing the use of medical and health resources.