Department of Methodology and Statistics, Care and Public Health Research Institute (CAPHRI), Maastricht University, Maastricht, the Netherlands.
Department of Methodology and Statistics, Graduate School of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands.
Stat Methods Med Res. 2021 Feb;30(2):357-375. doi: 10.1177/0962280220952833. Epub 2020 Sep 17.
To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. first selecting clusters and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related to the mean of the outcome variable of interest (informative cluster size), the following competing sampling designs are considered: sampling clusters with probability proportional to cluster size, and then the same number of individuals per cluster; drawing clusters with equal probability, and then the same percentage of individuals per cluster; and selecting clusters with equal probability, and then the same number of individuals per cluster. For each design, optimal sample sizes are derived under a budget constraint. The three optimal two-stage sampling designs are compared, in terms of efficiency, with each other and with simple random sampling of individuals. Sampling clusters with probability proportional to size is recommended. To overcome the dependency of the optimal design on unknown nuisance parameters, maximin designs are derived. The results are illustrated, assuming probability-proportional-to-size sampling of clusters, with the planning of a hypothetical survey to compare adolescent alcohol consumption between France and Italy.
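The three two-stage designs named in the abstract can be illustrated with a small simulation. What follows is a minimal sketch and not the authors' code: the synthetic population, the function names, the use of with-replacement sampling of clusters in the first design, and the choice of estimator for each design (an unweighted mean of cluster means, a pooled mean, and a size-weighted mean of cluster means, respectively) are all assumptions made for illustration. The abstract does not give the estimators or the optimal sample-size formulas, so none are reproduced here.

```python
import random


def make_population(n_clusters=200, seed=0):
    """Hypothetical population in which larger clusters have higher means."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        size = rng.randint(20, 200)
        # Informative cluster size: the cluster mean depends on the size.
        cluster_mean = 10 + 0.02 * size + rng.gauss(0, 1)
        clusters.append([cluster_mean + rng.gauss(0, 2) for _ in range(size)])
    return clusters


def mean(values):
    return sum(values) / len(values)


def pps_fixed_number(clusters, k, n, rng):
    """Design 1: k clusters drawn with probability proportional to size
    (with replacement, an assumption), then n individuals per cluster."""
    sizes = [len(c) for c in clusters]
    chosen = rng.choices(clusters, weights=sizes, k=k)
    # Assumed estimator: unweighted mean of the sampled cluster means.
    return mean([mean(rng.sample(c, n)) for c in chosen])


def equal_prob_fixed_fraction(clusters, k, fraction, rng):
    """Design 2: k clusters with equal probability, then the same
    percentage of individuals from each sampled cluster."""
    chosen = rng.sample(clusters, k)
    sampled = []
    for c in chosen:
        m = max(1, round(fraction * len(c)))
        sampled.extend(rng.sample(c, m))
    # Assumed estimator: pooled mean of all sampled individuals.
    return mean(sampled)


def equal_prob_fixed_number(clusters, k, n, rng):
    """Design 3: k clusters with equal probability, then n individuals
    per cluster."""
    chosen = rng.sample(clusters, k)
    total_size = sum(len(c) for c in chosen)
    # Assumed estimator: cluster means weighted by cluster size.
    return sum(len(c) * mean(rng.sample(c, n)) for c in chosen) / total_size


clusters = make_population()
rng = random.Random(42)
true_mean = mean([y for c in clusters for y in c])
print("population mean:", round(true_mean, 3))
print("design 1:", round(pps_fixed_number(clusters, 40, 10, rng), 3))
print("design 2:", round(equal_prob_fixed_fraction(clusters, 40, 0.1, rng), 3))
print("design 3:", round(equal_prob_fixed_number(clusters, 40, 10, rng), 3))
```

Repeating each design many times and comparing the spread of the estimates would give a rough empirical view of relative efficiency, although such a comparison ignores the budget constraint under which the paper derives its optimal sample sizes.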