Ren Hao, Zheng Yiying, Li Changjin, Jing Fengshi, Wang Qiting, Luo Zeyu, Li Dongxiao, Liang Deyi, Tang Weiming, Liu Li, Cheng Weibin
Institute for Healthcare Artificial Intelligence Application, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, No. 466 Xingangzhong Road, Haizhu District, Guangzhou, 510317, China, 86 13929587059.
Faculty of Data Science, City University of Macau, Macao SAR, China.
JMIR Aging. 2025 Apr 30;8:e67437. doi: 10.2196/67437.
Cognitive impairment, indicative of Alzheimer disease and other forms of dementia, significantly reduces the quality of life of older adults and imposes considerable burdens on families and health care systems worldwide. Early identification of individuals at risk for cognitive impairment through a convenient and rapid method is crucial for timely intervention.
The objective of this study was to explore the application of machine learning (ML) to integrate blood biomarkers, life behaviors, and disease history to predict the decline in cognitive function.
This study used data from the Chinese Longitudinal Healthy Longevity Survey (CLHLS). A total of 2688 participants aged 65 years or older from the 2008-2009, 2011-2012, and 2014 CLHLS waves were included, with cognitive impairment defined as a Mini-Mental State Examination (MMSE) score below 18. Participants with a baseline MMSE score below 18 were excluded from the cohort to ensure a more accurate assessment of cognitive function. The dataset was divided into a training set (n=1331), an internal test set (n=333), and a prospective validation set (n=1024). We developed ML models that integrate demographic information, health behaviors, disease history, and blood biomarkers to predict cognitive function at the 3-year follow-up; specifically, the models aimed to identify individuals whose MMSE score would fall below 18 by the end of the follow-up period. Model performance was evaluated using accuracy, sensitivity, and the area under the receiver operating characteristic curve.
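The outcome definition and the three evaluation metrics named above can be illustrated with a minimal sketch in plain Python. The toy MMSE scores, the predicted risk scores, and the 0.5 decision threshold are hypothetical values chosen for illustration and are not taken from the study; only the MMSE cutoff of 18 comes from the abstract.

```python
MMSE_CUTOFF = 18  # from the abstract: impairment is an MMSE score below 18


def impaired(mmse_score):
    """Return 1 if an MMSE score indicates cognitive impairment, else 0."""
    return int(mmse_score < MMSE_CUTOFF)


def accuracy(y_true, y_pred):
    """Fraction of participants whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """Fraction of truly impaired participants that the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """AUC as the probability that a random impaired participant receives
    a higher risk score than a random unimpaired one (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: follow-up MMSE scores and model risk scores.
followup_mmse = [27, 15, 22, 12, 29, 17, 25, 24]
risk_scores = [0.10, 0.80, 0.35, 0.90, 0.05, 0.40, 0.20, 0.55]

y_true = [impaired(s) for s in followup_mmse]
y_pred = [int(r >= 0.5) for r in risk_scores]  # assumed threshold of 0.5

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred)
auc = roc_auc(y_true, risk_scores)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} auc={auc:.3f}")
```

Accuracy and sensitivity depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason the three metrics can rank models differently.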
All ML models outperformed the MMSE alone. The balanced random forest achieved the highest accuracy (88.5% in the internal test set and 88.7% in the prospective validation set), albeit with a lower sensitivity, while logistic regression recorded the highest sensitivity. SHAP (Shapley Additive Explanations) analysis identified instrumental activities of daily living, age, and baseline MMSE scores as the most influential predictors for cognitive impairment.
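To give a sense of what a Shapley-based attribution measures, the following minimal sketch computes exact Shapley values for a toy linear risk score. It is not the study's model or the SHAP library: the weights, feature values, and reference participant are hypothetical, and replacing "absent" features with a single baseline value is a simplifying assumption (SHAP typically averages over background data).

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Features outside a coalition are set to their baseline value
    (a simplifying assumption for illustration).
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear risk score over three features:
# IADL limitations, age in years, and baseline MMSE score.
WEIGHTS = [0.5, 0.04, -0.1]


def toy_risk(features):
    return sum(w * f for w, f in zip(WEIGHTS, features))


participant = [4, 85, 20]  # hypothetical participant
reference = [1, 75, 26]    # hypothetical reference participant

phi = shapley_values(toy_risk, participant, reference)
for name, contribution in zip(["IADL", "age", "baseline MMSE"], phi):
    print(f"{name}: {contribution:+.2f}")
```

For a linear score, each attribution reduces to the feature's weight times its difference from the reference value, and the attributions sum to the difference between the participant's score and the reference score.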
Incorporating blood biomarkers, together with demographic information, life behaviors, and disease history, into ML models offers a convenient, rapid, and accurate approach for the early identification of older adults at risk of cognitive impairment. This method gives health care professionals a valuable tool for timely intervention and underscores the importance of integrating diverse data types in predictive health models.