Zemariam Alemu Birara, Abate Biruk Beletew, Alamaw Addis Wondmagegn, Lake Eyob Shitie, Yilak Gizachew, Ayele Mulat, Tilahun Befkad Derese, Ngusie Habtamu Setegn
Department of Pediatrics and Child Health Nursing, School of Nursing, College of Medicine and Health Science, Woldia University, Woldia, Ethiopia.
Department of Emergency and Critical Care Nursing, School of Nursing, College of Medicine and Health Science, Woldia University, Woldia, Ethiopia.
PLoS One. 2025 Jan 24;20(1):e0316452. doi: 10.1371/journal.pone.0316452. eCollection 2025.
Stunting is a key indicator of chronic undernutrition and reflects a failure to achieve expected linear growth. Growth and nutritional status need to be investigated during adolescence as well as in infancy and childhood. However, the available studies in Ethiopia have mostly focused on early childhood and relied on traditional statistical methods. This study therefore applied multiple machine learning algorithms to identify the most effective model for predicting stunting among adolescent girls in Ethiopia.
A weighted sample of 3156 adolescent girls aged 15-19 years was drawn from the 2016 Ethiopian Demographic and Health Survey dataset. The data were pre-processed, and 80% and 20% of the observations were used to train and test the models, respectively. Eight machine learning algorithms were built and compared. Model performance was assessed with evaluation metrics computed in Python. The synthetic minority oversampling technique (SMOTE) was used to balance the data, and the Boruta algorithm was used to select the most relevant features. Association rule mining with the Apriori algorithm, run in R, was used to generate the best rules linking the independent features to the target feature.
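The oversampling step can be illustrated with a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the study's code. The function names `smote` and `balance`, the toy two-feature data, and the choice of `k` are assumptions made for illustration, and the study's actual feature encoding and SMOTE settings are not given here. In practice a library such as imbalanced-learn would normally be used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Candidate neighbours: every other minority point, nearest first.
        others = minority[:i] + minority[i + 1:]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have the same size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy data: six majority-class rows (0) and two minority-class rows (1).
X = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (0.2, 0.8),
     (0.9, 0.1), (0.4, 0.4), (5.0, 5.0), (5.5, 4.5)]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y, minority_label=1, k=1)
print(y_bal.count(0), y_bal.count(1))
```

Because the synthetic rows are interpolations, this form of SMOTE assumes numeric features. Whether oversampling was applied before or after the 80/20 split is not stated in the abstract; a common precaution is to oversample only the training portion so that the test set contains real observations only.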
The random forest classifier (sensitivity = 81%, accuracy = 77%, precision = 75%, F1-score = 78%, AUC = 85%) outperformed the other machine learning algorithms considered in this study in predicting stunting. Region, poor wealth index, no formal education, unimproved toilet facility, rural residence, non-use of contraceptive methods, religion, age, no media exposure, occupation, and having one or more children were the top predictors of stunting. Association rule mining identified the seven rules most frequently associated with stunting among adolescent girls in Ethiopia.
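As a reading aid for the reported metrics, the sketch below shows how sensitivity, precision, accuracy, and F1-score follow from a confusion matrix. The counts are hypothetical, chosen only so that they reproduce the reported percentages on a balanced test set of 200 cases. They are not the study's confusion matrix, and the function name `classification_metrics` is an assumption. AUC is omitted because it depends on predicted probabilities and cannot be derived from a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (NOT from the study): 100 stunted and 100 non-stunted cases.
metrics = classification_metrics(tp=81, fp=27, tn=73, fn=19)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The reported figures are internally consistent: with precision 0.75 and sensitivity 0.81, the harmonic mean is 2 × 0.75 × 0.81 / (0.75 + 0.81) ≈ 0.78, which matches the stated F1-score.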
The random forest classifier performed best at predicting stunting and identifying its relevant predictors. The results show that machine learning algorithms can predict stunting accurately, making them potentially valuable decision-support tools for relevant stakeholders. Focusing interventions on the identified predictors could help halt stunting among adolescent girls.