Discipline of Public Health Medicine, School of Nursing and Public Health, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa.
Department of Statistics, College of Science, Bahir Dar University, Bahir Dar, Ethiopia.
Sci Rep. 2024 Jul 9;14(1):15801. doi: 10.1038/s41598-024-65620-1.
Symptoms of acute respiratory infections (ARIs) among under-five children are a global health challenge. We aimed to train and evaluate ten machine learning (ML) classification approaches for predicting mother-reported symptoms of ARIs among children younger than 5 years in sub-Saharan African (sSA) countries. We used the most recent (2012-2022) nationally representative Demographic and Health Surveys data from 33 sSA countries. Air pollution covariates, such as global annual surface particulate matter (PM2.5) and nitrogen dioxide, which are available as raster images, were obtained from the National Aeronautics and Space Administration (NASA). ML algorithms (MLAs) were used to predict the symptoms of ARIs among under-five children. We randomly split the dataset into two parts: 80% was used to train the models, and the remaining 20% was used to test the trained models. Model performance was evaluated using sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). A total of 327,507 under-five children were included in the study. About 7.10%, 4.19%, 20.61%, and 21.02% of children were reported to have symptoms of ARI, severe ARI, cough, and fever, respectively, in the 2 weeks preceding the survey. The prevalence of ARI was highest in Mozambique (15.3%), Uganda (15.05%), Togo (14.27%), and Namibia (13.65%), whereas Uganda (40.10%), Burundi (38.18%), Zimbabwe (36.95%), and Namibia (31.2%) had the highest prevalence of cough. The random forest (RF) results revealed that spatial location (longitude, latitude), particulate matter, land surface temperature, nitrogen dioxide, and the number of cattle in the household were the most important features for predicting symptoms of ARIs among under-five children in sSA. The RF algorithm was selected as the best ML model (AUC = 0.77, accuracy = 0.72) for predicting the symptoms of ARIs among children under five.
The ML algorithms performed well in predicting the symptoms of ARIs and identifying associated predictors among under-five children across the sSA countries. The random forest algorithm was identified as the best classifier for predicting the symptoms of ARIs among under-five children.
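The evaluation scheme described in the abstract (a random 80/20 split, then sensitivity, specificity, accuracy, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names (`split_80_20`, `confusion_metrics`, `roc_auc`), the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration, and no study data are used.

```python
import random


def split_80_20(rows, seed=0):
    """Randomly split rows into an 80% training part and a 20% test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = symptom reported)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(pairs),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and predicted probabilities (hypothetical, not study data).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.7]
# Assumed threshold of 0.5 to turn probabilities into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = confusion_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
train, test = split_80_20(range(10))
print(metrics, auc, len(train), len(test))
```

In practice the scores would come from a fitted classifier, such as a random forest, applied to the held-out 20%. Reporting AUC alongside accuracy matters here because the outcome is imbalanced (about 7% ARI prevalence), so accuracy alone can look high for a model that rarely predicts a positive case.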