Li Wanyue, Zeng Li, Yuan Shiqi, Shang Yaru, Zhuang Weisheng, Chen Zhuoming, Lyu Jun
Department of Rehabilitation, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong, China.
The Second Clinical Medical College of Guizhou University of Traditional Chinese Medicine, Guiyang, Guizhou, China.
Front Neurosci. 2023 Apr 27;17:1158141. doi: 10.3389/fnins.2023.1158141. eCollection 2023.
The purpose of this study was to develop and validate a predictive model of cognitive impairment in older adults based on a novel machine learning (ML) algorithm.
Complete data for 2,226 participants aged 60-80 years were extracted from the 2011-2014 National Health and Nutrition Examination Survey (NHANES) database. Cognitive ability was assessed with a composite cognitive functioning score (Z-score) calculated using a correlation test among the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) Word Learning and Delayed Recall tests, the Animal Fluency Test, and the Digit Symbol Substitution Test. Thirteen demographic characteristics and risk factors associated with cognitive impairment were considered: age, sex, race, body mass index (BMI), alcohol consumption, smoking status, direct HDL-cholesterol level, stroke history, dietary inflammatory index (DII), glycated hemoglobin (HbA1c), Patient Health Questionnaire-9 (PHQ-9) score, sleep duration, and albumin level. Feature selection was performed using the Boruta algorithm. Models were built with ten-fold cross-validation using five ML algorithms: generalized linear model (GLM), random forest (RF), support vector machine (SVM), artificial neural network (ANN), and stochastic gradient boosting (SGB). The performance of these models was evaluated in terms of discriminatory power and clinical application.
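As an illustration of the ten-fold cross-validation step, the following minimal sketch partitions sample indices into ten folds of near-equal size. It is not the authors' code: the function name `k_fold_indices`, the seed, and the choice to apply the folds to the 1,559 training participants reported in the study are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the study.
    """
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the partition is reproducible.
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


# Assumption: cross-validation is run on the 1,559-participant training set.
folds = list(k_fold_indices(1559, k=10))
```

Each participant appears in exactly one validation fold, so every model is fitted ten times and its validation performance is averaged over the folds.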
The study ultimately included 2,226 older adults, of whom 384 (17.25%) had cognitive impairment. After random assignment, 1,559 and 667 older adults were included in the training and test sets, respectively. Ten variables (age, race, BMI, direct HDL-cholesterol level, stroke history, DII, HbA1c, PHQ-9 score, sleep duration, and albumin level) were selected to construct the models. In the test set, the areas under the receiver operating characteristic curve (AUC) for the GLM, RF, SVM, ANN, and SGB models were 0.779, 0.754, 0.726, 0.776, and 0.754, respectively. Among all models, the GLM had the best predictive performance in terms of discriminatory power and clinical application.
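Discriminatory power is reported here as the area under the receiver operating characteristic curve. A minimal sketch of that metric, computed as the probability that a randomly chosen impaired participant receives a higher predicted risk than a randomly chosen unimpaired one, might look as follows. The function name `roc_auc` and the toy labels and scores are hypothetical and are not study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = impaired, 0 = not impaired.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
auc = roc_auc(labels, scores)  # 4 of 6 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.726 to 0.779 should be read.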
ML models can be a reliable tool for predicting the occurrence of cognitive impairment in older adults. This study used ML methods to develop and validate a well-performing risk prediction model for cognitive impairment in older adults.