Araujo-Moura Keisyanne, Souza Letícia, de Oliveira Tiago Almeida, Rocha Mateus Silva, De Moraes Augusto César Ferreira, Chiavegatto Filho Alexandre
Department of Epidemiology, School of Public Health, University of São Paulo, São Paulo, Brazil.
Department of Statistics, State University of Paraíba, Campina Grande, Paraíba, Brazil.
Int J Public Health. 2025 Mar 11;70:1607944. doi: 10.3389/ijph.2025.1607944. eCollection 2025.
To develop a machine learning (ML) model utilizing transfer learning (TL) techniques to predict hypertension in children and adolescents across South America.
Data from two cohorts (children and adolescents) in seven South American cities were analyzed. A TL strategy was implemented by transferring knowledge from a CatBoost model trained on the children's sample and adapting it to the adolescent sample. Model performance was evaluated using standard metrics.
Among children, the prevalence of normal blood pressure was 88.9% (301 participants), while 14.1% (50 participants) had elevated blood pressure (EBP). In the adolescent group, the prevalence of normal blood pressure was 92.5% (284 participants), with 7.5% (23 participants) presenting with EBP. Random Forest, XGBoost, and LightGBM achieved high accuracy (0.90) for children, with XGBoost and LightGBM demonstrating superior recall (0.50) and AUC-ROC (0.74). For adolescents, models without TL showed poor performance, with accuracy and recall values remaining low and AUC-ROC ranging from 0.46 to 0.56. After applying TL, model performance improved significantly, with CatBoost achieving an AUC-ROC of 0.82, accuracy of 1.0, and recall of 0.18.
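The reported combination of high accuracy and low recall is typical when the positive class is rare, as with EBP here. The sketch below illustrates this with a purely hypothetical confusion matrix; the counts are assumptions chosen for illustration and are not the study's data, and the function names are likewise illustrative.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Share of true positive cases (e.g. EBP) that the model detects."""
    positives = tp + fn
    return tp / positives if positives else 0.0


# Hypothetical counts (NOT from the study): 11 EBP cases among 132 participants.
tp, fn = 2, 9      # EBP cases detected / missed
tn, fp = 120, 1    # normal blood pressure correctly / wrongly classified

acc = accuracy(tp, fp, tn, fn)
rec = recall(tp, fn)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")
```

With imbalanced classes like these, a model can classify most participants correctly while still missing the majority of EBP cases, which is why recall and AUC-ROC are read alongside accuracy.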
Soft drinks, filled cookies, and chips were key dietary predictors of elevated blood pressure, with higher intake in adolescents. Machine learning with transfer learning effectively identified these risks, emphasizing the need for early dietary interventions to prevent hypertension and support cardiovascular health in pediatric populations.