Health Research Institute, Faculty of Health, University of Canberra, Canberra, Australia.
Department of Climate and Environment Health, Biomedical Research Foundation, Dhaka, Bangladesh.
Inform Health Soc Care. 2021 Dec 2;46(4):425-442. doi: 10.1080/17538157.2021.1904938. Epub 2021 Apr 14.
Childhood stunting is a serious public health concern in Bangladesh. Earlier research used conventional statistical methods to identify the risk factors for stunting, and very little is known about the applicability and usefulness of machine learning (ML) methods, which can identify the risk factors for various health conditions from complex data. This research evaluates the performance of ML methods in predicting stunting among children under 5 years of age using 2014 Bangladesh Demographic and Health Survey data. In addition, this paper identifies the variables that are important for predicting stunting in Bangladesh. Among the selected ML methods, gradient boosting provides the smallest misclassification error in predicting stunting, followed by random forests, support vector machines, classification tree and logistic regression with forward-stepwise selection. The top 10 variables (in order of importance) for predicting childhood stunting in Bangladesh are child age, wealth index, maternal education, preceding birth interval, paternal education, division, household size, maternal age at first birth, maternal nutritional status, and parental age. Our study shows that ML can support the building of prediction models and highlights the importance of demographic, socioeconomic, nutritional and environmental factors in understanding stunting in Bangladesh.
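The abstract ranks the ML methods by misclassification error, that is, the share of children whose predicted stunting status differs from the observed one. The sketch below illustrates that metric and the resulting ranking in plain Python. It is not the authors' code: the function names, the model labels (`model_a`, `model_b`, `model_c`) and the ten toy labels are hypothetical and are not taken from the survey data.

```python
from typing import Dict, List, Sequence, Tuple


def misclassification_error(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of cases whose predicted label differs from the observed label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")
    wrong = sum(1 for observed, predicted in zip(y_true, y_pred) if observed != predicted)
    return wrong / len(y_true)


def rank_models(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Sort models from smallest to largest misclassification error."""
    scored = [
        (name, misclassification_error(y_true, y_pred))
        for name, y_pred in predictions.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical held-out labels (1 = stunted, 0 = not stunted); made up for illustration.
y_test = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from three unnamed models on the same ten children.
toy_predictions = {
    "model_a": [1, 0, 0, 1, 0, 1, 0, 0, 0, 0],  # 1 of 10 wrong
    "model_b": [1, 0, 1, 1, 0, 0, 0, 0, 0, 0],  # 3 of 10 wrong
    "model_c": [0, 0, 0, 1, 0, 1, 1, 0, 1, 0],  # 2 of 10 wrong
}

ranking = rank_models(y_test, toy_predictions)
for name, error in ranking:
    print(f"{name}: misclassification error = {error:.2f}")
```

In a real comparison the predictions would come from models fitted on a training split and evaluated on held-out data, so that the error reflects performance on children the model has not seen.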