West Brady T, McCabe Sean Esteban
Survey Methodology Program, Survey Research Center, Institute for Social Research, University of Michigan-Ann Arbor, Ann Arbor, MI
Institute for Research on Women and Gender, Substance Abuse Research Center, University of Michigan-Ann Arbor, Ann Arbor, MI
Stata J. 2012 Oct 1;12(4):718-725.
This article considers the situation in which a survey data producer has collected data from a sample with a complex design (possibly featuring stratification of the population, cluster sampling, and/or unequal probabilities of selection) and, for various reasons, provides secondary analysts of those data with only a final survey weight for each respondent and "average" design effects for survey estimates computed from the data. In general, these "average" design effects, presumably computed by the data producer in a way that fully accounts for all of the complex sampling features, already incorporate possible increases in sampling variance due to the use of the survey weights in estimation. A secondary analyst who then 1) uses the provided information to compute weighted estimates, 2) computes design-based standard errors reflecting variance in the weights (using Taylor series linearization, for example), and 3) inflates the estimated variances using the "average" design effects provided is applying a "double" adjustment to the standard errors for the effect of weighting on the variance estimates, leading to overly conservative inferences. We propose a simple method for preventing this problem and provide a Stata program for applying appropriate adjustments to variance estimates in this situation. We illustrate two applications of the method to survey data from the Monitoring the Future (MTF) study and conclude with suggested directions for future research in this area.
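The "double" adjustment can be made concrete with a small numerical sketch. The code below is illustrative only and is not the authors' Stata program. It assumes that the published "average" design effect factors approximately into a weighting component and a component for the remaining design features, and it uses Kish's approximation, 1 + CV²(w), for the weighting component. The function names `kish_weighting_deff` and `adjusted_variance` are hypothetical.

```python
def kish_weighting_deff(weights):
    """Kish's approximate design effect from unequal weighting.

    Equals n * sum(w^2) / (sum(w))^2, which is 1 + CV^2 of the weights.
    Using this approximation is an assumption of this sketch.
    """
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def adjusted_variance(var_weighted, avg_deff, weights):
    """Inflate a weighted (linearized) variance without double counting.

    var_weighted: variance estimate that already reflects the weights
                  but ignores stratification and clustering.
    avg_deff:     "average" design effect supplied by the data producer,
                  assumed to already include the weighting effect.
    """
    deff_w = kish_weighting_deff(weights)
    # Divide out the weighting component so it is applied only once
    # (assumes avg_deff is roughly deff_w times the remaining effect).
    deff_remaining = avg_deff / deff_w
    return var_weighted * deff_remaining


if __name__ == "__main__":
    weights = [1.0, 3.0]      # toy weights, not real survey data
    var_weighted = 0.0004     # hypothetical linearized variance
    avg_deff = 2.5            # hypothetical published average design effect
    print("naive (double-adjusted):", var_weighted * avg_deff)
    print("adjusted:", adjusted_variance(var_weighted, avg_deff, weights))
```

In this toy case the naive product inflates the variance by the full factor of 2.5, while the adjusted version inflates it only by the part of the design effect not already captured by the weights. Whether a lower bound (for example, a remaining design effect of at least 1) should be imposed is a design choice this sketch leaves open.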