Hindmarsh Diane, Steel David
Bureau of Health Information, Level 2, 1 Reserve Road St Leonards, NSW, Australia.
National Institute for Applied Statistics Research Australia, University of Wollongong, Wollongong, NSW, Australia.
AIMS Public Health. 2020 Jun 22;7(2):403-424. doi: 10.3934/publichealth.2020034. eCollection 2020.
Regular health surveys can produce reliable estimates at higher geographic levels but not for small areas. Alternatives are to aggregate data over several years or to use model-based methods. We created and evaluated model-based estimates of four health-related outcomes, by gender, for 153 Local Government Areas using data from the New South Wales Population Health Survey. The evaluation examined evidence of bias, determined which covariates were available and appropriate for each outcome variable, and considered the likely precision of the resulting estimates. For each outcome variable, the bias and precision of single-year results (2006-2008) under six covariate specifications were compared with direct survey estimates based on a single year's data and with those obtained by aggregating over seven years. A practical issue is how to choose the covariates to include in the models, as the best covariate specification varies between outcome variables. Model-based results had median root mean squared errors between 3.3% and 5.5% (maxima 5.2% and 11.3%, respectively) and median relative root mean squared errors between 6.8% and 24.5% (maxima 11.7% and 41.5%, respectively). The model-based estimates were unbiased when compared with direct estimates based on one or seven years of data, and when aggregated to a level at which direct estimates were reliable. The bias and reliability assessment process provides a way for policymakers to have confidence in model-based estimates.
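The two precision measures quoted in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions, in which the root mean squared error (RMSE) combines bias and variance and the relative RMSE (RRMSE) divides the RMSE by the estimate. The abstract does not state the exact formulas the authors used, so the function names and the numbers below are hypothetical and are not taken from the paper.

```python
import math


def rmse(bias: float, variance: float) -> float:
    """Root mean squared error: sqrt(bias^2 + variance)."""
    return math.sqrt(bias ** 2 + variance)


def relative_rmse(rmse_value: float, estimate: float) -> float:
    """RMSE expressed as a fraction of the estimate itself."""
    if estimate == 0:
        raise ValueError("relative RMSE is undefined for a zero estimate")
    return rmse_value / abs(estimate)


# Hypothetical illustration values (assumptions, not results from the paper).
example_estimate = 0.20    # estimated prevalence of 20%
example_bias = 0.0         # assume the estimator is unbiased
example_variance = 0.0016  # variance, i.e. a standard error of 0.04

example_rmse = rmse(example_bias, example_variance)
example_rrmse = relative_rmse(example_rmse, example_estimate)
print(f"RMSE = {example_rmse:.1%}, RRMSE = {example_rrmse:.1%}")
```

Under these definitions the same RMSE gives a larger RRMSE for a rarer outcome, which is one possible reason the two measures can span different ranges across outcome variables.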