Department of Mathematical Sciences, School of Natural and Applied Sciences, University of Malawi, Zomba, Malawi.
National Statistical Office of Malawi, Zomba, Malawi.
BMC Public Health. 2023 Aug 31;23(1):1674. doi: 10.1186/s12889-023-16544-4.
The birth and death rates of a population are among the crucial vital statistics for socio-economic policy planning in any country. Since the under-five mortality rate is one of the indicators for monitoring the health of a population, it requires regular and accurate estimation. National demographic and health survey data, which are readily available to the public, have become a means of answering most health-related questions among African populations using relevant statistical methods. However, many such applications tend to ignore the survey design effect in estimation, despite the availability of statistical tools that support design-based analyses. Little is known about how much inaccuracy this introduces when predicting under-five mortality rates. This study estimates and compares the bias encountered when applying unweighted and weighted logistic regression methods to predict the under-five mortality rate in Malawi using nationwide survey data.
The Malawi Demographic and Health Survey data of 2004, 2010, and 2015-16 were used to determine the bias. The analyses were carried out in R software version 3.6.3 and Stata version 12.0. A logistic regression model that included various bio- and socio-demographic factors concerning the child, mother and household was used to estimate the under-five mortality rate.
The results showed that the accuracy of predicting the national under-five mortality rate hinges on cluster-weighting of the overall predicted probability of child deaths, regardless of whether the model was weighted or not. Weighting the model caused small positive and negative changes in various fixed-effect estimates, which diffused the effect of weighting in the fitted probabilities of death. In turn, there was no difference between the overall predicted mortality rate obtained using the weighted model and that obtained using the unweighted model.
We recommend applying survey cluster-weights when computing the overall predicted probability of events for a binary health outcome. When the aim of the model is to predict a population parameter, this can be done without worrying about the weights during model fitting.
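As a rough illustration of this recommendation, the sketch below averages fitted death probabilities with and without survey weights. It is not the authors' code: the function name `overall_predicted_rate`, the per-1000 scaling, and all numeric values are assumptions made up for the example, and the fitted probabilities are taken as given rather than produced by an actual logistic regression.

```python
import numpy as np


def overall_predicted_rate(fitted_probs, weights=None, per=1000):
    """Average fitted probabilities of death, expressed per `per` children.

    If `weights` is given, each child's fitted probability is weighted by
    its survey weight; otherwise a simple unweighted mean is used.
    (Hypothetical helper, not taken from the paper.)
    """
    p = np.asarray(fitted_probs, dtype=float)
    if weights is None:
        return per * p.mean()
    w = np.asarray(weights, dtype=float)
    if w.shape != p.shape:
        raise ValueError("weights must have the same shape as fitted_probs")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    return per * np.sum(w * p) / np.sum(w)


# Made-up fitted probabilities for four children: two from an over-sampled
# cluster (small weights) and two from an under-sampled one (large weights).
probs = np.array([0.10, 0.10, 0.02, 0.02])
weights = np.array([0.5, 0.5, 1.5, 1.5])

unweighted = overall_predicted_rate(probs)
weighted = overall_predicted_rate(probs, weights)
print(f"unweighted: {unweighted:.1f} per 1000, weighted: {weighted:.1f} per 1000")
```

In this toy case the two summaries differ only because of the weights applied at the averaging step, which is the step the recommendation concerns; the fitted probabilities themselves are the same in both calls.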