Song Ying, Sun Yansun, Weng Qi, Yi Li
Department of Neurology, Peking University Shenzhen Hospital, Shenzhen, China.
Department of Geriatrics, Peking University Shenzhen Hospital, Shenzhen, China.
Heliyon. 2024 Oct 19;10(20):e39575. doi: 10.1016/j.heliyon.2024.e39575. eCollection 2024 Oct 30.
Memory decline is the earliest symptom of various neurodegenerative diseases, such as Alzheimer's disease (AD). However, the ability to accurately predict memory decline and identify the risk factors leading up to it has remained limited.
The objective of this study was to develop and validate a machine learning model that can accurately predict risk factors for memory decline among US adults.
A total of 9971 individuals were enrolled from the National Health and Nutrition Examination Survey (NHANES) 2015-2016 database. The least absolute shrinkage and selection operator (LASSO) was used to screen for characteristic predictors. Five machine learning (ML) algorithms were employed: Logistic Regression, ExtraTrees classifier, Bagging classifier, eXtreme Gradient Boosting (XGBoost), and Random Forest (RF). The performance of each model was evaluated using the confusion matrix, area under the curve (AUC), accuracy, precision, specificity, recall, and F1 score.
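The workflow described above (LASSO screening followed by a comparison of five classifiers on AUC) can be illustrated with a minimal scikit-learn sketch. Everything in it is an assumption made for illustration, not the authors' code: the data are synthetic rather than NHANES, the LASSO step is approximated by an L1-penalised logistic regression inside `SelectFromModel`, `GradientBoostingClassifier` stands in for XGBoost so that the example depends on scikit-learn only, and all hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (roughly 8% positives); NOT the NHANES data.
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=8,
    weights=[0.92, 0.08], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def lasso_selector():
    """LASSO-style screen: keep features with non-zero L1-penalised coefficients."""
    l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    return SelectFromModel(l1_model)


# The five model families named in the abstract; settings are illustrative only.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=200, random_state=0),
    "Bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "GradientBoosting (stand-in for XGBoost)": GradientBoostingClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
}

aucs = {}
for name, model in models.items():
    # Scaling and feature screening are fitted on the training split only.
    pipe = make_pipeline(StandardScaler(), lasso_selector(), model)
    pipe.fit(X_train, y_train)
    proba = pipe.predict_proba(X_test)[:, 1]
    aucs[name] = roc_auc_score(y_test, proba)

for name, auc in sorted(aucs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
```

Placing the selector inside the pipeline keeps the feature screening from seeing the held-out data, which is one common way to avoid optimistic AUC estimates.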
The final sample comprised 4525 subjects, of whom 7.7% (N = 347) exhibited memory decline. The ExtraTrees classifier and XGBoost models demonstrated superior predictive performance and clinical value compared with the other machine learning models, with AUC values of 0.915 and 0.911, respectively. They also predicted memory decline accurately in the external datasets, with AUCs of 0.851 and 0.843, respectively.
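Because only 347 of the 4525 subjects had memory decline, accuracy alone can be misleading, which is one reason to report AUC, recall, and specificity alongside it. The following sketch uses only the two counts reported above; the helper `classification_metrics` and the "always predict no decline" baseline are hypothetical illustrations, not part of the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the usual summary metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Counts reported in the abstract.
n_total = 4525
n_decline = 347
prevalence = n_decline / n_total  # about 0.077, i.e. 7.7%

# Hypothetical baseline: a model that labels every subject as "no decline".
baseline = classification_metrics(tp=0, fp=0, tn=n_total - n_decline, fn=n_decline)

print(f"prevalence: {prevalence:.3f}")
print(f"baseline accuracy: {baseline['accuracy']:.3f}")  # about 0.923
print(f"baseline recall: {baseline['recall']:.3f}")      # 0.000
```

The baseline reaches about 92.3% accuracy while detecting none of the 347 cases, so a high accuracy figure on this sample says little without the other metrics.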
The ExtraTrees classifier and XGBoost models were the two best-performing models for predicting memory decline. Nevertheless, further studies are needed to confirm these findings.