The Framingham Heart Study, Framingham, MA 01701, USA.
Department of Medicine, University of Massachusetts Medical School, Worcester, MA 01655, USA.
Cells. 2022 Apr 30;11(9):1506. doi: 10.3390/cells11091506.
Blood biomarkers for dementia have the potential to identify preclinical disease and improve participant selection for clinical trials. Machine learning is an efficient analytical strategy to simultaneously identify multiple candidate biomarkers for dementia. We aimed to identify important candidate blood biomarkers for dementia using three machine learning models. We included 1642 (mean 69 ± 6 yr, 53% women) dementia-free Framingham Offspring Cohort participants attending examination 7 who had available blood biomarker data. We developed three machine learning models to identify candidate biomarkers for incident dementia: support vector machine (SVM), eXtreme gradient boosting of decision trees (XGB), and artificial neural network (ANN). Over a mean 12 ± 5 yr follow-up, 243 (14.8%) participants developed dementia. In multivariable models including all 38 available biomarkers, the XGB model demonstrated the strongest predictive accuracy for incident dementia (AUC 0.74 ± 0.01), followed by ANN (AUC 0.72 ± 0.01) and SVM (AUC 0.69 ± 0.01). Stepwise feature elimination by random sampling identified a subset of the nine most highly informative biomarkers. Machine learning models confined to these nine biomarkers showed improved predictive accuracy for dementia (XGB, AUC 0.76 ± 0.01; ANN, AUC 0.75 ± 0.004; SVM, AUC 0.73 ± 0.01). A parsimonious panel of nine candidate biomarkers was identified that showed moderately good predictive accuracy for incident dementia, although our results require external validation.
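For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an AUC can be computed and how repeated estimates might be summarized as mean ± SD. It is an illustration only, not the authors' pipeline: the function names `auc` and `summarize`, the toy labels and scores, and the reading of "±" as a standard deviation across repeated runs are all assumptions not stated in the abstract.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen case (label 1) receives
    a higher score than a randomly chosen non-case (label 0), with ties
    counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean and sample standard deviation of AUCs from repeated runs (assumed meaning of the +/- values)."""
    return mean(aucs), stdev(aucs)


# Hypothetical toy data: 1 = developed dementia, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)

# Incidence reported in the abstract: 243 of 1642 participants.
incidence_pct = round(100 * 243 / 1642, 1)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values of 0.69 to 0.76 sit in the moderate range the authors describe.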