Department of Health Informatics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.
Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.
J Epidemiol Glob Health. 2024 Sep;14(3):1089-1099. doi: 10.1007/s44197-024-00259-9. Epub 2024 Jul 29.
BACKGROUND: Diarrhea is the second most common cause of death among children under five. Predicting diarrheal disease early and identifying its determinants with advanced machine learning models is the most effective way to save children's lives. Hence, this study aimed to predict diarrheal disease, identify its determinants, and generate rules using machine learning models. METHODS: The study used secondary data from the Demographic and Health Survey (DHS) datasets of 12 East African countries, analyzed in Python. Machine learning techniques including Random Forest, Decision Tree (DT), K-Nearest Neighbor, and Logistic Regression (LR) were applied, and wrapper feature selection and SHAP values were used to identify determinants. RESULTS: The final experiments indicated that the random forest model performed best at predicting diarrheal disease, with an accuracy of 86.5%, precision of 89%, recall of 82%, F-measure of 86%, and AUC of 92%. The important predictors identified were age, countries, wealth status, mother's educational status, mother's age, source of drinking water, number of under-five children immunization status, media exposure, timing of breastfeeding, mother's working status, type of toilet, and twin status; these were associated with a higher predicted probability of diarrheal disease. CONCLUSION: According to this study, ensuring that child caregivers are fully aware of sanitation and child feeding practices, and that mothers are educated, can reduce diarrhea-related child mortality in East Africa. These findings support policy recommendations aimed at reducing infant mortality in East Africa.
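The F-measure reported in the abstract is the harmonic mean of precision and recall. The sketch below is not from the study; it only illustrates that relationship using the rounded precision (89%) and recall (82%) quoted above. The function name `f_measure` is a hypothetical helper introduced for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Return the F-measure (F1 score): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Both metrics are zero, so the harmonic mean is defined as zero here.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as quoted in the abstract (not the study's raw outputs).
reported_precision = 0.89
reported_recall = 0.82

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure from rounded precision and recall: {f1:.3f}")
```

Because the inputs are already rounded to whole percentages, the result (about 85.4%) agrees with the reported 86% only to within rounding error.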