Matthew J. Salganik
Department of Sociology, 1180 Amsterdam Avenue, New York, NY 10027, USA.
J Urban Health. 2006 Nov;83(6 Suppl):i98-112. doi: 10.1007/s11524-006-9106-x.
Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling.
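The closing recommendation amounts to assuming a design effect of about 2. A minimal sketch of what that means for a sample size calculation follows. It assumes the study aims to estimate a prevalence within a given margin of error and uses the standard normal-approximation formula for simple random sampling. The function names, the 95% z-value, and the example inputs are illustrative assumptions, not taken from the paper.

```python
import math


def srs_sample_size(p, margin, z=1.96):
    """Sample size needed to estimate a prevalence p to within +/- margin
    under simple random sampling (normal approximation, z=1.96 for 95%)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def rds_sample_size(p, margin, design_effect=2.0, z=1.96):
    """Inflate the simple-random-sampling requirement by a design effect.

    design_effect=2.0 mirrors the abstract's general recommendation;
    the value appropriate for a particular study may differ.
    """
    return math.ceil(design_effect * z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: expected prevalence 30%, margin of error 5 points.
    print("SRS:", srs_sample_size(0.30, 0.05))
    print("RDS:", rds_sample_size(0.30, 0.05))
```

With these hypothetical inputs the simple random sampling requirement is 323 respondents, and the requirement after applying the design effect is 646.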