Jeddi Zineb, Gryech Ihsane, Ghogho Mounir, El Hammoumi Maryame, Mahraoui Chafiq
TICLab, College of Engineering & Architecture, International University of Rabat, Rabat 11103, Morocco.
ENSIAS, Mohammed V University in Rabat, Rabat 10000, Morocco.
Healthcare (Basel). 2021 Oct 29;9(11):1464. doi: 10.3390/healthcare9111464.
The prevalence of childhood asthma and its associated risk factors vary significantly across countries and regions. In Morocco, the scarcity of available medical data makes scientific research on diseases such as asthma very challenging. In this paper, we build machine learning models to predict the occurrence of childhood asthma using data from a prospective study of 202 children with and without asthma. The association between different factors and asthma diagnosis is first assessed using a Chi-squared test. Then, predictive models such as logistic regression, decision trees, random forest and support vector machine are used to explore the relationship between childhood asthma and the various risk factors. In the Chi-squared feature selection step, 19 of the 36 factors were found to be significantly associated (p-value < 0.05) with childhood asthma; these include history of atopic diseases in the family; presence of mites, cold air, strong odors and mold in the child's environment; mode of birth; breastfeeding; and early life habits and exposures. For asthma prediction, random forest yielded the best predictive performance (accuracy = 84.9%), followed by logistic regression (accuracy = 82.57%), support vector machine (accuracy = 82.5%) and decision trees (accuracy = 75.19%). The decision tree model has the advantage of being easily interpreted. This study identified important maternal and prenatal risk factors for childhood asthma, the majority of which are avoidable. Appropriate steps are needed to raise awareness about the prenatal risk factors.
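As an illustration of the Chi-squared screening step described in the abstract, the following minimal sketch tests whether one binary factor is associated with asthma status at the 0.05 level. It is not the authors' code: the function name `chi2_2x2`, the variable names, and the counts in `example_table` are hypothetical and are not taken from the study. The sketch assumes a 2x2 contingency table, no continuity correction, and non-zero row and column totals.

```python
import math


def chi2_2x2(table):
    """Pearson Chi-squared test of independence for a 2x2 table.

    `table` is [[a, b], [c, d]]: rows are factor present / absent,
    columns are asthma / no asthma. No continuity correction is applied.
    Returns (statistic, p_value).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the Chi-squared survival function
    # reduces to erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, for illustration only (not data from the study).
example_table = [[30, 20],   # factor present: asthma, no asthma
                 [15, 35]]   # factor absent:  asthma, no asthma

stat, p_value = chi2_2x2(example_table)
keep_factor = p_value < 0.05  # threshold stated in the abstract
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}, keep = {keep_factor}")
```

Repeating this test for each candidate factor and keeping those with p-value < 0.05 gives one plausible reading of the feature selection step; factors with more than two categories would need a general r x c version of the test.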