Li Zhao, Wu Wenzhong, Kang Hyunsik
College of Sport Science, Sungkyunkwan University, Suwon 16419, Republic of Korea.
Healthcare (Basel). 2024 Dec 13;12(24):2527. doi: 10.3390/healthcare12242527.
This study aimed to develop and validate a machine learning (ML)-based metabolic syndrome (MetS) risk prediction model. We examined data from 6155 participants in the 2011 China Health and Retirement Longitudinal Study (CHARLS). LASSO regression feature selection identified the best MetS predictors. Nine ML algorithms were used to build predictive models. Model performance was validated using cohort data from the Korea National Health and Nutrition Examination Survey (KNHANES) (n = 5297), the United Kingdom (UK) Biobank (n = 218,781), and the National Health and Nutrition Examination Survey (NHANES) (n = 2549). Results: The multilayer perceptron (MLP)-based model performed best in the CHARLS cohort (AUC = 0.8908; PRAUC = 0.8073), the logistic model in the KNHANES cohort (AUC = 0.9101; PRAUC = 0.8116), the XGBoost model in the UK Biobank cohort (AUC = 0.8556; PRAUC = 0.6246), and the MLP model in the NHANES cohort (AUC = 0.9055; PRAUC = 0.8264). Our MLP-based model has the potential to serve as a clinical tool for detecting MetS in different populations.
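The abstract compares models by AUC and PRAUC. As a rough illustration of what these two numbers measure, the following is a minimal sketch in plain Python, not the authors' code. It assumes that PRAUC can be approximated by average precision, which is one common estimator of the area under the precision-recall curve; the study may have used a different estimator. The function names and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC: probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values taken at the rank
    of each positive case. Assumption: used here as a stand-in for PRAUC.
    This simple version does not treat tied scores specially."""
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank cases from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_positives = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this rank
    return total / n_pos


# Hypothetical toy data: 1 = MetS, 0 = no MetS; scores are predicted risks.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("Average precision:", average_precision(toy_labels, toy_scores))
```

AUC does not depend on how common MetS is in a cohort, whereas precision-based measures do, so PRAUC values from cohorts with different MetS prevalence are not directly comparable.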