Wang Yuhang, Shi Shuang, Wei Xinghua, Wu Yanjing, Shi Yunlong, Cai Jin
Graduate School, Nantong University, Nantong, Jiangsu, People's Republic of China.
Department of Pediatrics, Affiliated Hospital of Nantong University, Nantong, Jiangsu, People's Republic of China.
Diabetes Metab Syndr Obes. 2025 Jul 7;18:2221-2233. doi: 10.2147/DMSO.S519284. eCollection 2025.
The concurrent rise of childhood obesity and hyperuricemia presents a serious public health concern. These conditions interact through complex metabolic mechanisms and significantly increase long-term risks of cardiometabolic diseases. Machine learning (ML) offers an effective framework for constructing efficient risk prediction models in pediatric populations.
This study aimed to develop and evaluate two ML models, Random Forest (RF) and Support Vector Classification (SVC), to predict the risk of childhood obesity and hyperuricemia by integrating clinical and biochemical variables.
A total of 101 children were enrolled, including 60 with obesity and 41 with obesity plus hyperuricemia. Data preprocessing involved recursive feature elimination (RFE), ROSE-based oversampling, and feature standardization. Both RF and SVC models were trained and evaluated using area under the ROC curve (AUC), precision-recall curves, and calibration curves. SHAP (Shapley Additive Explanations) analysis was conducted to interpret feature contributions.
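Two of the preprocessing steps named above, standardization and oversampling, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (only the 60/41 class sizes come from the study), `smoothed_oversample` is a hypothetical helper that adds Gaussian-jittered copies of minority rows in the spirit of ROSE rather than reproducing the ROSE package, the `bandwidth` value is arbitrary, and standardizing before oversampling is a choice made for the sketch.

```python
import random
import statistics

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by the sample SD."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def smoothed_oversample(X, y, bandwidth=0.1, seed=0):
    """Balance classes by appending jittered copies of smaller-class rows.

    Hypothetical helper: a simplified smoothed bootstrap, not the ROSE package.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            base = rng.choice(rows)  # pick an observed row of this class
            X_out.append([v + rng.gauss(0.0, bandwidth) for v in base])
            y_out.append(label)
    return X_out, y_out

# Synthetic features; class sizes mirror the cohort (60 obesity only, 41 with hyperuricemia).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(5, 2)] for _ in range(101)]
y = [0] * 60 + [1] * 41

X_std = standardize(X)
X_bal, y_bal = smoothed_oversample(X_std, y)
print(y_bal.count(0), y_bal.count(1))  # 60 60
```

After balancing, the smaller class has 19 synthetic rows added, so both classes contain 60 samples.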
Both models demonstrated strong predictive performance, with AUCs reaching 0.96. The SVC model achieved slightly higher average precision and recall, making it more suitable for community- or school-based screening of high-risk children. In contrast, the RF model exhibited superior calibration, suggesting its greater utility in clinical decision-making where probabilistic risk estimation guides personalized follow-up or intervention planning. SHAP analysis identified glomerular filtration rate (GFR), high-density lipoprotein cholesterol (HDL-C), and apolipoprotein B (ApoB) as key predictors, some exhibiting nonlinear associations with disease risk.
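The distinction drawn above, equal AUC but different calibration, can be illustrated with a small worked example. The sketch below is an assumption-laden toy: the labels and scores are invented, and the Brier score is used as a simple calibration-sensitive summary in place of the calibration curves reported in the study.

```python
def roc_auc(y_true, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, probs):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(probs, y_true)) / len(y_true)

# Invented labels and predicted probabilities for two hypothetical models.
y_true = [0, 0, 1, 1]
model_a = [0.10, 0.40, 0.35, 0.80]  # spread-out probabilities
model_b = [0.45, 0.48, 0.47, 0.55]  # same ranking, compressed toward 0.5

print(roc_auc(y_true, model_a), roc_auc(y_true, model_b))  # 0.75 0.75
print(brier_score(y_true, model_a) < brier_score(y_true, model_b))  # True
```

Both toy models rank the cases identically, so their AUCs match, yet the first has the lower Brier score. Ranking quality matters most for screening, while the accuracy of the probabilities themselves matters when a risk estimate drives follow-up decisions.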
RF and SVC models offer reliable tools for early risk prediction of obesity and hyperuricemia in children, each suited to a different clinical scenario. These findings support early identification and targeted intervention. Future studies will explore the integration of metabolomic data and ensemble approaches to further enhance model performance and clinical applicability.