Elliott Michael R
Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA.
J Off Stat. 2008 Dec 1;24(4):517-540.
In sample surveys where sampled units have unequal probabilities of inclusion, associations between the inclusion probabilities and the statistic of interest can induce bias. Weights equal to the inverse of the probability of inclusion are often used to counteract this bias. Highly disproportional sample designs have highly variable weights, which can introduce undesirable variability in statistics such as the population mean or linear regression estimates. Weight trimming reduces large weights to a fixed maximum value, reducing variability but introducing bias. Most standard approaches are ad hoc in that they do not use the data to optimize the bias-variance tradeoff. This manuscript develops variable selection models, termed "weight pooling" models, that extend weight trimming procedures in a Bayesian model averaging framework to produce "data-driven" weight trimming estimators. We develop robust yet efficient models that approximate fully weighted estimators when bias correction is of greatest importance, and approximate unweighted estimators when variance reduction is critical.
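The tradeoff described above can be illustrated with a minimal sketch of conventional weight trimming. This is not the weight pooling model developed in the manuscript. The inclusion probabilities, outcome values, and the cutoff of 8 are hypothetical, and the helper names are introduced only for illustration. Variability of the weights is summarized with Kish's approximate design effect, which is an added assumption and not taken from the abstract.

```python
def inverse_probability_weights(probs):
    """Weights equal to the inverse of each unit's inclusion probability."""
    return [1.0 / p for p in probs]


def trim_weights(weights, cap):
    """Reduce every weight above `cap` to `cap` (fixed-cutoff trimming)."""
    return [min(w, cap) for w in weights]


def weighted_mean(values, weights):
    """Ratio-type weighted mean: sum(w * y) / sum(w)."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def kish_design_effect(weights):
    """Kish's approximation n * sum(w^2) / (sum(w))^2; equals 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical sample: the last unit has a small inclusion probability
# and a large outcome, so inclusion is associated with the outcome.
probs = [0.5, 0.5, 0.25, 0.0625]
y = [1.0, 2.0, 3.0, 10.0]

full_weights = inverse_probability_weights(probs)
trimmed_weights = trim_weights(full_weights, cap=8.0)  # arbitrary cutoff
equal_weights = [1.0] * len(y)

for label, w in [("fully weighted", full_weights),
                 ("trimmed", trimmed_weights),
                 ("unweighted", equal_weights)]:
    print(label, round(weighted_mean(y, w), 4), round(kish_design_effect(w), 4))
```

In this toy sample the trimmed estimate falls between the fully weighted and unweighted estimates, and the design effect shrinks as the weights become less variable. The fixed cutoff is chosen without reference to the data, which is the ad hoc feature the abstract contrasts with a data-driven approach.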