Tan ChuXia, Liu Yuan, Li Lijun, Li Ying, Yang Pingting, Duan Yinglong, Wang Xingxing, Zhang Huiyi, Wang Jingying, Zhang Honglian
Health Management Medicine Center, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China.
Nursing Department, The Third Xiangya Hospital, Central South University, 138 Tongzi Po Road, Changsha, Hunan, 410013, China.
BMC Med Inform Decis Mak. 2025 Jul 28;25(1):280. doi: 10.1186/s12911-025-03123-3.
Hyperuricemia (HUA) is a global public health challenge. Although its overall epidemiological characteristics have been widely reported, its age-specific risk pattern remains controversial. This study aimed to identify the risk factors for HUA in different age groups of a healthy physical examination population and to construct a machine learning-driven risk prediction model to support precise intervention.
A cross-sectional study design was adopted. A total of 2821 physical examinees seen at a tertiary hospital between January 2022 and December 2024 were included and divided into five groups by age (Group I: 18-30 years old, n = 185; Group II: 31-40 years old, n = 532; Group III: 41-50 years old, n = 753; Group IV: 51-60 years old, n = 714; Group V: > 60 years old, n = 637). Sociodemographic and health behavior data were collected through electronic questionnaires. Univariate analysis and binary logistic regression were performed in SPSS 27.0 to screen for independent risk factors (P < 0.05). Logistic regression (LR), random forest (RF) and eXtreme Gradient Boosting (XGBoost) models were then constructed using Python 3.8.2, and the feature importance of the optimal model was ranked.
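The model comparison and feature ranking described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the data are synthetic (generated with scikit-learn's `make_classification`), the feature names are placeholders borrowed from the variables named in this abstract, the train/test split and hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only one library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder feature names (assumption); the values below are synthetic, not study data.
FEATURES = ["creatinine", "triglycerides", "total_protein",
            "bmi", "alt", "fasting_glucose"]

# Synthetic stand-in for one age group: 532 rows, roughly 23% positive class.
X, y = make_classification(n_samples=532, n_features=len(FEATURES),
                           n_informative=4, n_redundant=0,
                           weights=[0.77], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Three candidate classifiers; "GB" is a stand-in for XGBoost (assumption).
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),
}

# Fit each model and score it by AUC on the held-out split.
auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

best = max(auc, key=auc.get)


def rank_features(model, names):
    """Return feature names ordered from most to least important."""
    if hasattr(model, "feature_importances_"):
        # Tree ensembles expose impurity-based importances.
        scores = model.feature_importances_
    else:
        # For the LR pipeline, use absolute standardized coefficients.
        scores = np.abs(model[-1].coef_[0])
    return [names[i] for i in np.argsort(scores)[::-1]]


ranking = rank_features(models[best], FEATURES)
print("AUC by model:", auc)
print("Best model:", best)
print("Feature ranking:", ranking)
```

In a stratified analysis like the one described, a loop of this kind would be run once per age group, so that each group gets its own best model and its own ranking. Standardizing inputs before logistic regression is one way to make coefficient magnitudes comparable across features; whether the study did so is not stated in the abstract.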
The overall detection rate of HUA was 22.8%, and serum uric acid levels increased significantly with age. Risk factors differed significantly among age groups. In Group I, fasting blood glucose was an independent risk factor for HUA. In Group II, six factors, including BMI and alanine aminotransferase, were included in the regression model; the logistic regression model (AUC = 0.840) showed that creatinine, triglycerides and serum total protein were the key predictors. In Group III, HUA was significantly correlated with six factors, including gender and alcohol consumption; the top features by importance in logistic regression (AUC = 0.766) were creatinine, gender, and alcohol consumption. In Group IV, four factors, including gender and alcohol consumption, were included in the regression model; logistic regression (AUC = 0.749) showed that creatinine, serum total protein, and alcohol consumption were the main predictive indicators. In Group V, influencing factors such as educational level and dietary taste were prominent; the top three features by importance in logistic regression (AUC = 0.756) were creatinine, gender, and alcohol consumption.
This study confirmed a high HUA detection rate in the health examination population and significant differences in clinical characteristics across age groups. Machine learning models helped to explore risk factors in depth and confirmed the association between health behaviors and examination status, providing a reference for age-stratified HUA intervention.
Not applicable.