Franco Carolina, Little Roderick J A, Louis Thomas A, Slud Eric V
Center for Statistical Research and Methodology (CSRM), US Census Bureau, 4600 Silver Hill Road, Washington DC 20233, USA.
Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
J Surv Stat Methodol. 2019 Sep;7(3):334-364. doi: 10.1093/jssam/smy019. Epub 2019 Jan 7.
The most widespread method of computing confidence intervals (CIs) in complex surveys is to add and subtract the margin of error (MOE) from the point estimate, where the MOE is the estimated standard error multiplied by the appropriate Gaussian quantile. This Wald-type interval is used by the American Community Survey (ACS), the largest US household sample survey. For inferences on small proportions with moderate sample sizes, this method often results in marked undercoverage and a lower CI endpoint below 0. We assess via simulation the coverage and width, in complex sample surveys, of seven alternatives to the Wald interval for a binomial proportion, each with the sample size replaced by the 'effective sample size,' that is, the sample size divided by the design effect. Building on previous work by the present authors, our simulations address the impact of clustering, stratification, different stratum sampling fractions, and stratum-specific proportions. We show that all intervals undercover when there is clustering and design effects are computed from a simple design-based estimator of sampling variance. Coverage can be better calibrated for the alternatives to Wald by improving estimation of the effective sample size through superpopulation modeling. This approach is more effective in our simulations than previously proposed modifications of effective sample size. We recommend intervals of the Wilson or Bayes uniform prior form, with the Jeffreys prior interval not far behind.
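The contrast between the Wald interval and an effective-sample-size alternative can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `wald_interval` and `wilson_interval`, the example values (proportion 0.01, sample size 400, design effect 2), and the choice to treat the design effect as a known input are all assumptions made for illustration. The abstract stresses that estimating the design effect well is the difficult part in practice, and this sketch does not attempt it.

```python
from math import sqrt
from statistics import NormalDist


def effective_sample_size(n, deff):
    """Sample size divided by the design effect (deff assumed known here)."""
    return n / deff


def wald_interval(p_hat, n_eff, level=0.95):
    """Point estimate plus/minus MOE, where MOE = Gaussian quantile * SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt(p_hat * (1 - p_hat) / n_eff)
    return p_hat - z * se, p_hat + z * se


def wilson_interval(p_hat, n_eff, level=0.95):
    """Wilson score interval with n replaced by the effective sample size."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    denom = 1 + z**2 / n_eff
    center = (p_hat + z**2 / (2 * n_eff)) / denom
    half = (z / denom) * sqrt(
        p_hat * (1 - p_hat) / n_eff + z**2 / (4 * n_eff**2)
    )
    return center - half, center + half


# Hypothetical inputs: a small proportion with a moderate sample size.
p_hat, n, deff = 0.01, 400, 2.0
n_eff = effective_sample_size(n, deff)

wald = wald_interval(p_hat, n_eff)
wilson = wilson_interval(p_hat, n_eff)
print("Wald:  ", wald)    # lower endpoint falls below 0 for these inputs
print("Wilson:", wilson)  # both endpoints stay inside (0, 1)
```

With these assumed inputs the Wald lower endpoint is negative, which is the defect the abstract describes, while the Wilson interval stays within the unit interval. The sketch says nothing about coverage, which the paper assesses by simulation.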