Statistics Discipline, Khulna University, Khulna, Bangladesh.
Department of Statistics, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh, Bangladesh.
PLoS One. 2022 May 26;17(5):e0267190. doi: 10.1371/journal.pone.0267190. eCollection 2022.
Low birth weight is one of the primary causes of child mortality and of several diseases later in life in developing countries, especially in Southern Asia. The main objective of this study is to determine the risk factors for low birth weight and to predict low birth weight babies using machine learning algorithms.
Low birth weight data were taken from the Bangladesh Demographic and Health Survey, 2017-18, which included 2351 respondents. The risk factors associated with low birth weight were investigated using binary logistic regression. Two machine learning-based classifiers (logistic regression and decision tree) were adopted to characterize and predict low birth weight. Model performance was evaluated by accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve.
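The six evaluation measures listed above can all be derived from true labels, predicted labels, and predicted scores. The following is a minimal sketch in plain Python, not the authors' code: the function names `confusion_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions, with 1 standing for a low birth weight baby and 0 for a normal birth weight baby.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Threshold-based metrics from binary labels (1 = low birth weight)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; these are not values from the survey.
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(confusion_metrics(y_true, y_pred))
print(auc(y_true, scores))
```

Accuracy, sensitivity, specificity, and the two predictive values depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two kinds of measure are usually reported together.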
The proportion of low birth weight babies in Bangladesh was 16.2%. The respondent's region, education, wealth index, height, twin birth, and living children were statistically significant risk factors for low birth weight babies. Under 90:10 holdout validation, the logistic regression-based classifier achieved 87.6% accuracy and an area under the curve of 0.59, whereas the decision tree achieved 85.4% accuracy and an area under the curve of 0.55.
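A 90:10 holdout split of the kind described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the helper name `holdout_split`, the fixed seed, and the simple unstratified random shuffle are all hypothetical choices, and the abstract does not say how the split was drawn.

```python
import random
from typing import List, Tuple


def holdout_split(n_samples: int, test_fraction: float = 0.10,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly partition row indices into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


# 2351 is the number of respondents reported in the abstract.
train_idx, test_idx = holdout_split(2351)
print(len(train_idx), len(test_idx))  # 2116 235
```

The classifier would be fitted on the rows in `train_idx` only and evaluated once on the rows in `test_idx`, so the reported measures reflect data the model has not seen.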
The logistic regression-based classifier classified low birth weight babies more accurately than the decision tree. This study's findings indicate the need for an efficient, cost-effective, and integrated complementary approach to reducing and correctly predicting low birth weight in Bangladesh.