Okeyo Valine Atieno, Orowe Idah, Oguge Nicholas Otienoh
University of Nairobi, Department of Mathematics, Kenya.
University of Nairobi, Center for Advanced Studies in Environmental Law and Policy, Kenya.
Int J Innov Sci Res Technol. 2024 Jul;9(7):3489-3492. doi: 10.38124/ijisrt/ijisrt24jul1521.
This study investigates the predictive capability of a Random Forest model in identifying respiratory diseases attributed to PM2.5 exposure in Nairobi County. Using a dataset of demographic and air quality variables, the model achieved an accuracy of 79.97% and an area under the curve (AUC) of 0.872, indicating that it distinguishes effectively between respiratory and cardiovascular conditions. Its sensitivity and specificity were 81.88% and 73.27%, respectively, showing that it correctly identifies most true positives and true negatives. Feature importance analysis showed that age and PM2.5 concentration were the most influential predictors of health outcomes, emphasizing the impact of air pollution and demographic factors on respiratory and cardiovascular health. Furthermore, training and test error rates remained consistent across varying training set sizes, which suggests that the model is stable and generalizes well. This study underscores the importance of addressing air quality to mitigate the health impacts of PM2.5 exposure in urban settings.
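The accuracy, sensitivity, specificity, and AUC quoted in the abstract all follow from standard definitions for a binary classifier. The sketch below shows those definitions in plain Python. It is illustrative only: the function names are hypothetical, the labels and scores are invented toy values rather than study data, and it assumes the respiratory class is coded as the positive label (1), which the abstract does not state.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities, invented for illustration (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = summary_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

Sensitivity and specificity depend on the classification threshold (0.5 here, by assumption), whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of figures are reported separately.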