Supervised machine learning for classification and prediction of stunting among under-five Egyptian children.

Author information

Hendy Abdelaziz, Ibrahim Rasha Kadri, Abdelaliem Sally Mohammed Farghaly, Zaher Ahmed, Alkubati Sameer A, El-Kader Rabab Gad Abd, Hendy Ahmed

Affiliations

Pediatric Nursing Department, Faculty of Nursing, Ain Shams University, Cairo, Egypt.

Nursing Department, Fatima College of Health Sciences, Al Dhafra Region, Madinat Zayed, UAE.

Publication information

BMC Pediatr. 2025 Sep 18;25(1):681. doi: 10.1186/s12887-025-06138-x.

Abstract

INTRODUCTION

Stunting, a significant form of chronic undernutrition, affects millions of children under five worldwide and poses substantial challenges to physical, cognitive, and socioeconomic development—particularly in low- and middle-income countries like Egypt.

AIMS

This study aims to apply and compare the performance of various supervised machine learning (ML) algorithms to classify and predict stunting among Egyptian children under five years old. It also aims to identify key risk factors that contribute to stunting.

METHODS

Data came from the Egypt Demographic and Health Surveys (DHS) conducted in 2005, 2008, and 2014. After extensive data cleaning and preprocessing, including handling missing values and addressing class imbalance, five ML classifiers (XGBoost, Logistic Regression, Random Forest, Gradient Boosting, and K-Nearest Neighbors) were trained and evaluated using 10-fold stratified cross-validation. Performance metrics included accuracy, precision, recall, F1 score, and ROC-AUC.
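The evaluation protocol described in the Methods can be illustrated with a minimal, standard-library-only sketch. It shows how stratified folds keep the stunted/not-stunted ratio roughly constant in every fold and how accuracy, precision, recall, and F1 follow from the confusion counts. The abstract does not state which software the authors used, so the function names `stratified_k_fold` and `binary_metrics` and the toy labels are hypothetical and are not the study's code or data.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class ratios in every fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 for labels coded 1 (stunted) / 0 (not stunted)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for illustration only: 30 stunted, 70 not stunted.
toy_labels = [1] * 30 + [0] * 70
splits = list(stratified_k_fold(toy_labels, k=10))
toy_metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 1])
```

In a real analysis, a classifier would be fitted on each `train_idx` and scored on the matching `test_idx`, and the per-fold metrics would then be averaged. Stratification matters here because stunting is the minority class, and an unstratified split could leave some folds with very few stunted children.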

RESULTS

Gradient Boosting and Random Forest achieved the highest predictive performance, with accuracy scores exceeding 90% and ROC-AUC values above 0.96. Logistic Regression also performed robustly, while K-Nearest Neighbors showed relatively lower performance due to sensitivity to noise and high-dimensional data. Significant predictors of stunting included the child’s nutritional status, maternal education, birth size, wealth index, and rural residence.
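To make the ROC-AUC figures reported above easier to interpret, the sketch below computes ROC-AUC from its pairwise definition: the probability that a randomly chosen stunted child receives a higher predicted score than a randomly chosen non-stunted child, with ties counted as half. The function name `roc_auc` and the toy scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = 0.0
    # Quadratic pairwise comparison: fine for a demo, slow for survey-sized data.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predictions invented for illustration: 3 stunted (1), 4 not stunted (0).
toy_true = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_true, toy_scores)
```

Under this reading, a value above 0.96 means that in more than 96% of such pairs the model scores the stunted child higher. A value of 0.5 corresponds to random ranking.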

CONCLUSION

Supervised machine learning, especially the Gradient Boosting and Random Forest techniques, showed excellent accuracy in predicting stunting among children under five years of age in Egypt. These results highlight the utility of machine learning in identifying vulnerable groups for targeted public health interventions. Future studies are encouraged to use more recent data and to focus on multi-level feature selection and hyperparameter optimization to further improve prediction precision.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/897c/12445022/fe4238e200fe/12887_2025_6138_Fig1_HTML.jpg