Department of Mathematical Sciences, University of Malawi, Zomba, Malawi.
Department of Statistics, University of Pretoria, Pretoria, South Africa.
BMC Med Res Methodol. 2024 Aug 8;24(1):175. doi: 10.1186/s12874-024-02283-6.
Childhood stunting is a major indicator of child malnutrition and a focus area of Global Nutrition Targets for 2025 and Sustainable Development Goals. Risk factors for childhood stunting are well studied and well known and could be used in a risk prediction model for assessing whether a child is stunted or not. However, the selection of child stunting predictor variables is a critical step in the development and performance of any such prediction model. This paper compares the performance of child stunting diagnostic predictive models based on predictor variables selected using a set of variable selection methods.
Firstly, we conducted a subjective review of the literature to identify determinants of child stunting in Sub-Saharan Africa. Secondly, a multivariable logistic regression model of child stunting was fitted using the identified predictors to data on children aged 0-59 months from the Malawi Demographic and Health Survey (MDHS 2015-16). Thirdly, several reduced multivariable logistic regression models were fitted, each using the predictor variables selected by one of seven variable selection algorithms, including backward, forward, stepwise, random forest, Least Absolute Shrinkage and Selection Operator (LASSO), and judgmental selection. Lastly, for each reduced model, a childhood stunting risk score, defined as the child's propensity score based on the derived coefficients, was calculated for each child and used as a diagnostic predictive model. The risk prediction models were assessed using discrimination measures, including the area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity.
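The risk score and discrimination measures described above can be illustrated with a minimal sketch. It assumes the risk score is the predicted probability from a fitted logistic model; the function names, coefficients, and toy data below are hypothetical and are not taken from the study.

```python
import math

def risk_score(coefs, intercept, x):
    """Predicted probability from a logistic model: 1 / (1 + exp(-(b0 + b.x)))."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

def auroc(scores, labels):
    """Probability that a random stunted child scores higher than a random
    non-stunted child; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when a score >= cutoff is called stunted."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: four children, label 1 = stunted.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auroc(toy_scores, toy_labels)
toy_sens, toy_spec = sens_spec(toy_scores, toy_labels, cutoff=0.37)
```

In practice the coefficients would come from the fitted reduced model and the scores would be evaluated on held-out test data rather than on toy values.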
The review identified 68 predictor variables of child stunting, of which 27 were available in the MDHS 2015-16 data. The risk factors selected by all the variable selection methods were household wealth index, age of the child, household size, type of birth (singleton/multiple births), and birth weight. The best cut-off point for the child stunting risk prediction model was 0.37, based on risk factors determined by the judgmental variable selection method. The model's discriminative ability, measured by AUROC, was 64% (95% CI: 60%-67%) in the test data. For children residing in urban areas, the corresponding AUROC was 67% (95% CI: 58%-76%), compared with 63% (95% CI: 59%-67%) for those in rural areas.
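The abstract does not state how the best cut-off was chosen. One common criterion is Youden's J statistic (sensitivity + specificity - 1), and the sketch below assumes that criterion purely for illustration; the function name and toy data are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return the (cutoff, J) pair that maximizes Youden's J over the observed scores.
    Assumption: a score >= cutoff is classified as stunted (label 1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical toy data, not study data.
cut, j = youden_cutoff([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

For a screening tool, a cut-off that favours sensitivity over specificity may be preferred, so other criteria could reasonably yield a different threshold.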
The derived child stunting diagnostic prediction model could be useful as a first screening tool to identify children more likely to be stunted. The identified children could then receive necessary nutritional interventions.