

A stacked ensemble machine learning model for the prediction of pentavalent 3 vaccination dropout in East Africa.

Author information

Alemayehu Meron Asmamaw, Kebede Shimels Derso, Walle Agmasie Damtew, Mamo Daniel Niguse, Enyew Ermias Bekele, Adem Jibril Bashir

Affiliations

Department of Epidemiology and Biostatistics, College of Medicine and Health Sciences, Institute of Public Health, University of Gondar, Gondar, Ethiopia.

Department of Health Informatics, School of Public Health, College of Medicine and Health Science, Wollo University, Dessie, Ethiopia.

Publication information

Front Big Data. 2025 Apr 7;8:1522578. doi: 10.3389/fdata.2025.1522578. eCollection 2025.

Abstract

INTRODUCTION

Vaccination is critical for reducing childhood mortality, yet completion rates for the third dose of the pentavalent vaccine (Penta 3) in East Africa remain inadequate. This study aims to predict Penta 3 vaccination dropout using a stacking ensemble machine learning model with Demographic and Health Survey (DHS) data. The objective is to identify predictors of dropout and enhance intervention strategies.

METHODS

Seven base machine learning algorithms were combined into a stacked ensemble, and three meta-learners were evaluated: Random Forest (RF), Generalized Linear Model (GLM), and Extreme Gradient Boosting (XGBoost). The base learners and the stacked super learners were built with the H2O package. Two feature selection (FS) methods, LASSO and Boruta, were applied and compared. The selected features were one-hot encoded, with ordinal encoding applied where appropriate. Hyperparameter optimization (HPO) was carried out with both grid search and random search, and the two strategies were compared. Model performance was assessed using five metrics, including accuracy and the area under the curve (AUC). SHAP (SHapley Additive exPlanations) values were used to interpret the model outputs and identify influential predictors. The results are presented by experiment.
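The stacking step described above can be illustrated with a minimal sketch. It is not the authors' H2O pipeline: it assumes synthetic two-feature data in place of DHS records, uses two simple one-feature threshold models as stand-ins for the seven base learners, and uses a hand-written logistic regression (a GLM-style meta-learner) trained on out-of-fold base predictions. All function names here are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, seed):
    """Synthetic binary data (an assumption, not DHS data): two noisy features."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append([y + rng.gauss(0, 1.0), y + rng.gauss(0, 1.0)])
        labels.append(y)
    return rows, labels


def fit_stump(rows, labels, feature):
    """Stand-in base learner: threshold one feature midway between class means."""
    pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
    neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda r: sigmoid(r[feature] - cut)


def fit_logistic(X, y, lr=1.0, epochs=500):
    """Meta-learner: logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return lambda xi: sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))


def fit_stack(rows, labels, k=5):
    """Train the meta-learner on out-of-fold base predictions, then refit bases."""
    n, n_feat = len(rows), len(rows[0])
    oof = [[0.0] * n_feat for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        tr_rows = [rows[i] for i in train]
        tr_y = [labels[i] for i in train]
        for j in range(n_feat):
            base = fit_stump(tr_rows, tr_y, j)
            for i in held_out:
                oof[i][j] = base(rows[i])  # prediction from a model that never saw row i
    meta = fit_logistic(oof, labels)
    bases = [fit_stump(rows, labels, j) for j in range(n_feat)]
    return lambda r: meta([m(r) for m in bases])


def accuracy(model, rows, labels):
    hits = sum((model(r) >= 0.5) == (y == 1) for r, y in zip(rows, labels))
    return hits / len(rows)


train_X, train_y = make_data(400, seed=0)
test_X, test_y = make_data(400, seed=1)
stack = fit_stack(train_X, train_y)
stack_acc = accuracy(stack, test_X, test_y)
base_acc = accuracy(fit_stump(train_X, train_y, 0), test_X, test_y)
print(f"single base learner: {base_acc:.3f}, stacked ensemble: {stack_acc:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners were fitted on, is what keeps the second level from simply learning the base learners' training-set overfitting.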

RESULTS

Four experiments were conducted to evaluate the feature selection and HPO methods. All stacked ensemble models outperformed the individual learners. The XGBoost meta-learner, tuned with grid search on LASSO-selected features, achieved the highest performance: 93.9% accuracy and 99.4% AUC. The RF and GLM meta-learners were also evaluated but were outperformed by the XGBoost meta-learner. SHAP analysis identified key features influencing Penta 3 dropout, including place of delivery, decision-making autonomy, the mother's level of earnings, and healthcare access. Home delivery increased the risk of dropout, while postnatal care by midwives and health insurance coverage lowered the likelihood of dropout.
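To make the SHAP attribution idea concrete, the sketch below computes exact Shapley values by enumerating feature subsets for a toy risk score. Everything in it is an assumption for illustration: the feature names echo the predictors named above, but the coefficients are invented and are not estimates from the study, and a single all-zero baseline stands in for the background data that SHAP tooling would normally average over.

```python
from itertools import combinations
from math import factorial


def toy_risk(f):
    """Hypothetical dropout risk score; coefficients are invented for illustration."""
    return (0.2
            + 0.3 * f["home_delivery"]
            - 0.15 * f["midwife_pnc"]
            - 0.1 * f["insured"]
            - 0.1 * f["home_delivery"] * f["insured"])  # toy interaction term


def shapley_values(model, instance, baseline):
    """Exact Shapley values: features outside a subset are set to the baseline."""
    names = list(instance)
    n = len(names)

    def value(subset):
        mixed = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


instance = {"home_delivery": 1, "midwife_pnc": 1, "insured": 1}
baseline = {"home_delivery": 0, "midwife_pnc": 0, "insured": 0}
phi = shapley_values(toy_risk, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

In this toy setup a positive value pushes the score above the baseline and a negative value pulls it below, and the contributions sum to the difference between the score for the instance and the score for the baseline. The interaction term is split equally between the two features involved.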

CONCLUSION AND RECOMMENDATION

This study provides insights into the factors influencing Penta 3 vaccination dropout in East Africa. To reduce dropout rates, interventions should focus on enhancing maternal livelihood opportunities, improving healthcare access in rural areas, and promoting institutional deliveries.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8be/12009798/4d6ea9555b7a/fdata-08-1522578-g0001.jpg
