Andridge Rebecca R, West Brady T, Little Roderick J A, Boonstra Philip S, Alvarado-Leiton Fernanda
Ohio State University, Columbus, USA.
University of Michigan, Ann Arbor, USA.
J R Stat Soc Ser C Appl Stat. 2019 Nov;68(5):1465-1483. doi: 10.1111/rssc.12371. Epub 2019 Aug 2.
Rising costs of survey data collection and declining response rates have caused researchers to turn to non-probability samples to make descriptive statements about populations. However, unlike probability samples, non-probability samples may produce severely biased descriptive estimates due to selection bias. The paper develops and evaluates a simple model-based index of the potential selection bias in estimates of population proportions due to non-ignorable selection mechanisms. The index depends on an inestimable parameter ranging from 0 to 1 that captures the amount of deviation from selection at random and is thus well suited to a sensitivity analysis. We describe modified maximum likelihood and Bayesian estimation approaches and provide new and easy-to-use R functions for their implementation. We use simulation studies to evaluate the ability of the proposed index to reflect selection bias in non-probability samples and show how the index outperforms a previously proposed index that relies on an underlying normality assumption. We demonstrate the use of the index in practice with real data from the National Survey of Family Growth.