Yilema Seyifemickael Amare, Shiferaw Yegnanew A, Moyehodie Yikeber Abebaw, Fenta Setegn Muche, Belay Denekew Bitew, Fenta Haile Mekonnen, Nigussie Teshager Zerihun, Chen Ding-Geng
Department of Statistics, Debre Tabor University, Debre Tabor, Ethiopia.
Department of Statistics, University of Pretoria, Pretoria, South Africa.
Front Public Health. 2025 Jul 18;13:1549210. doi: 10.3389/fpubh.2025.1549210. eCollection 2025.
Community-based health insurance (CBHI) is a vital tool for achieving universal health coverage (UHC), a key global health priority outlined in the Sustainable Development Goals (SDGs). Sub-Saharan Africa continues to face challenges in achieving UHC and in protecting individuals from the financial burden of disease. As a result, CBHI has become popular in low- and middle-income countries, including Ethiopia. This study therefore aimed to identify the machine learning (ML) algorithm with the best predictive accuracy for CBHI enrollment and to determine the most influential predictors in the dataset.
Data from the 2019 Ethiopian Mini Demographic and Health Survey (EMDHS) were used. CBHI enrollment was predicted using seven machine learning models, including linear discriminant analysis (LDA), support vector machine with a radial basis function kernel (SVM), k-nearest neighbors (KNN), classification and regression tree (CART), and random forest (RF). Receiver operating characteristic (ROC) curves and other metrics were used to evaluate each model's performance.
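As an illustration of the ROC-based comparison described above, the following is a minimal sketch that computes the area under the ROC curve (AUC) from predicted scores and picks the model with the highest value. The function name `roc_auc`, the model names, and all labels and scores are hypothetical toy values, not taken from the EMDHS data or from the study's results.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumption): 1 = enrolled in CBHI, 0 = not enrolled.
labels = [1, 0, 1, 0, 1, 0]

# Hypothetical predicted enrollment probabilities from two models.
model_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.6, 0.1],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.8, 0.3],
}

aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
best = max(aucs, key=aucs.get)
print(aucs, best)
```

The pairwise formulation is quadratic in the number of observations, which is acceptable for a small illustration; for survey-sized data a rank-based or library implementation would be the more practical choice.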
The RF algorithm was the best-performing model across the performance measures considered. Age, wealth index, number of household members, and land usage were the most influential predictors of CBHI enrollment in Ethiopia.
This study found that an RF model can classify CBHI enrollment in Ethiopia with high accuracy. Feature importance identified age, wealth index, number of household members, and land usage among the variables most strongly associated with CBHI enrollment. The results can help health professionals and policymakers design targeted strategies to improve CBHI enrollment in Ethiopia.
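The abstract does not state which feature-importance measure was used with the RF model. As one possible illustration, the sketch below uses permutation importance, a model-agnostic technique swapped in here for clarity: each feature is shuffled in turn and the resulting drop in accuracy is recorded. The helper names, the toy data, and the stand-in classifier `toy_predict` are all hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which the prediction equals the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (assumption): feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]


def toy_predict(row):
    """Stand-in for a fitted classifier; it uses only feature 0."""
    return 1 if row[0] > 0.5 else 0


imp = permutation_importance(toy_predict, rows, labels)
print(imp)
```

In this toy setup the informative feature shows a large accuracy drop while the noise feature shows none, which is the pattern a ranking of predictors such as age or wealth index would be read from.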