Statistics and Applied Mathematics Sciences Institute, Durham, NC 27709, U.S.A.
Stat Med. 2014 Sep 28;33(22):3932-45. doi: 10.1002/sim.6219. Epub 2014 Jun 9.
The National Health Interview Survey, conducted by the National Center for Health Statistics, is designed to provide reliable design-based estimates for a wide range of health-related variables for the nation and the four major geographical regions of the USA. However, state-level or substate-level estimates are likely to be unreliable because they are based on small sample sizes. In this paper, we compare the efficiency of different area-level models in estimating smoking prevalence for the 50 US states and the District of Columbia. Our study is based on survey data from the 2008 National Health Interview Survey in conjunction with a number of potentially related auxiliary variables obtained from the American Community Survey, an ongoing large complex survey conducted by the US Census Bureau. A major portion of this study is devoted to investigating several methods for estimating the survey sampling variances needed to implement an area-level hierarchical model. Based on our findings, a hierarchical Bayesian method that uses a survey-adjusted random sampling variance model to capture the complex survey sampling variability appears to be somewhat superior to the other area-level models considered in accounting for the small-sample behavior of estimated survey sampling variances. Several diagnostic procedures are presented to compare the proposed methods.
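To illustrate why the sampling variances matter in an area-level model, the following minimal sketch uses the textbook composite (shrinkage) estimator of the Fay-Herriot type. It is an assumption for illustration, not the hierarchical Bayesian specification compared in the paper: the function name, its parameters, and the numbers in the usage lines are all hypothetical, and the sampling variance is treated as known rather than modeled as random.

```python
def composite_estimate(direct, sampling_var, synthetic, model_var):
    """Shrink a direct survey estimate toward a model-based prediction.

    All inputs are hypothetical illustration values:
    direct       -- design-based estimate for one area (e.g. a state)
    sampling_var -- sampling variance of that estimate, treated as known
    synthetic    -- regression prediction from auxiliary variables
    model_var    -- between-area (random effect) variance
    """
    if sampling_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = model_var + sampling_var
    if total == 0:
        raise ValueError("at least one variance must be positive")
    # Weight on the direct estimate: close to 1 when the survey estimate
    # is precise, close to 0 when its sampling variance is large.
    weight = model_var / total
    return weight * direct + (1.0 - weight) * synthetic


# Made-up numbers: equal variances give equal weight to both sources.
balanced = composite_estimate(0.30, 0.0004, 0.20, 0.0004)

# A much noisier direct estimate is pulled further toward the prediction.
noisy = composite_estimate(0.30, 0.0036, 0.20, 0.0004)
```

Because the weight depends directly on the sampling variance, an unstable variance estimate from a small state sample changes how strongly that state is shrunk, which is why the choice of variance estimation method receives so much attention.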