Mao Lijun, Lin Luotao, Shi Zumin, Song Hualing, Zhao Hailei, Xu Xianglong
School of Public Health, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
Nutrition and Dietetics Program, Department of Individual, Family, and Community Education, University of New Mexico, United States.
Heliyon. 2024 Sep 19;10(18):e38124. doi: 10.1016/j.heliyon.2024.e38124. eCollection 2024 Sep 30.
Multimorbidity, particularly diabetes combined with hypertension (DCH), is a significant public health concern. Currently, there is a gap in research utilizing machine learning (ML) algorithms to predict hypertension risk in Chinese middle-aged and elderly diabetic patients, and gender differences in DCH comorbidity patterns remain unclear. We aimed to use ML algorithms to predict DCH and identify its determinants among middle-aged and elderly diabetic patients in China.
Cross-sectional study.
Data were collected on 2775 adults with diabetes aged ≥45 years from the 2015 China Health and Retirement Longitudinal Study. We employed nine ML algorithms to develop prediction models for DCH. The performance of these models was evaluated using the area under the curve (AUC). Additionally, we conducted variable importance analysis to identify key determinants.
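To make the evaluation metric concrete, the following is a minimal sketch of how an AUC can be computed from predicted risk scores and observed outcomes. It uses the rank-based (Mann-Whitney) formulation, in which the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy data are hypothetical illustrations; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5.

    labels: iterable of 0/1 outcomes (1 = has the condition, e.g. DCH).
    scores: iterable of predicted risk scores, same length as labels.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    # Mann-Whitney U statistic for the positive class, normalized to [0, 1].
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Toy example (made-up values): two negatives, two positives.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases, which gives a scale for reading the values reported in this abstract.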
The best-performing prediction models for the overall population, men, and women were extreme gradient boosting (AUC = 0.728), light gradient boosting machine (AUC = 0.734), and random forest (AUC = 0.737), respectively. Age, waist circumference, body mass index, creatinine level, triglycerides, taking Western medicine, high-density lipoprotein cholesterol, blood urea nitrogen, total cholesterol, low-density lipoprotein cholesterol, and sleep disorders were important predictors shared across all three populations.
ML algorithms showed accurate predictive capabilities for DCH. Overall, non-linear ML models outperformed traditional logistic regression for predicting DCH. DCH predictions exhibited variations in predictors and model accuracy by gender. These findings could help identify DCH early and inform the development of personalized intervention strategies.