Department of Infectious Disease Epidemiology, Faculty of Epidemiology & Population Health, London School of Hygiene and Tropical Medicine, UK.
Epidemiology. 2012 Jan;23(1):138-47. doi: 10.1097/EDE.0b013e31823ac17c.
Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data.
Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample).
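The abstract does not name the estimators behind the "RDS estimates". As a rough illustration of what such inference typically involves, the sketch below assumes an inverse-degree-weighted estimator of the Volz-Heckathorn (RDS-II) form, in which each recruit is down-weighted by his reported network size. The function names and the toy inputs are hypothetical and are not taken from the study.

```python
from typing import Sequence


def sample_proportion(has_trait: Sequence[bool]) -> float:
    """Unweighted proportion of recruits with the trait (the 'RDS sample' figure)."""
    return sum(has_trait) / len(has_trait)


def rds_ii_proportion(has_trait: Sequence[bool], degrees: Sequence[int]) -> float:
    """Inverse-degree-weighted proportion (assumed RDS-II form).

    Recruits with many contacts are more likely to be reached by
    link-tracing, so each recruit is weighted by 1 / reported degree.
    """
    if len(has_trait) != len(degrees):
        raise ValueError("has_trait and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("reported degrees must be positive")
    weights = [1.0 / d for d in degrees]
    weighted_with_trait = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_with_trait / sum(weights)


# Hypothetical toy data: the two recruits with the trait report large networks.
toy_trait = [True, True, False, False]
toy_degrees = [10, 10, 2, 2]
print(sample_proportion(toy_trait))
print(rds_ii_proportion(toy_trait, toy_degrees))
```

In this toy case the weighted estimate falls well below the raw sample proportion, because the recruits with the trait are the well-connected ones. Whether such reweighting moves an estimate toward the truth depends on how well reported degree reflects the actual chance of recruitment.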
We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, men of higher socioeconomic status, and men whose sexual activity and HIV status were unknown. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion.
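The two summary figures above (the share of RDS estimates closer to the truth than the raw sample proportions, and the share of 95% confidence intervals containing the population proportion) could be computed along the following lines. This is a minimal sketch and not the authors' code; the function names and all numbers in the example are hypothetical placeholders.

```python
from typing import Sequence, Tuple


def share_of_estimates_closer(
    truth: Sequence[float],
    sample_props: Sequence[float],
    estimates: Sequence[float],
) -> float:
    """Fraction of variables where the adjusted estimate is strictly closer
    to the true population proportion than the unadjusted sample proportion."""
    closer = sum(
        abs(e - t) < abs(s - t)
        for t, s, e in zip(truth, sample_props, estimates)
    )
    return closer / len(truth)


def ci_coverage(
    truth: Sequence[float],
    intervals: Sequence[Tuple[float, float]],
) -> float:
    """Fraction of confidence intervals that contain the true proportion."""
    covered = sum(lo <= t <= hi for t, (lo, hi) in zip(truth, intervals))
    return covered / len(truth)


# Hypothetical values for three variables (not data from the study).
truth = [0.30, 0.50, 0.10]
sample_props = [0.25, 0.55, 0.12]
estimates = [0.28, 0.60, 0.15]
intervals = [(0.20, 0.35), (0.52, 0.70), (0.05, 0.20)]

print(share_of_estimates_closer(truth, sample_props, estimates))
print(ci_coverage(truth, intervals))
```

If the inference methods worked as intended, the first figure would be well above one half and the second close to 95%. Both reported ranges fall short of those benchmarks.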
Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling, and caution is required when interpreting findings based on it.