Aubry Philippe
OFB - Office français de la biodiversité - Direction surveillance, évaluation, données - Unité données et appui méthodologique, Saint Benoist, BP 20, F-78612 Le Perray-en-Yvelines, France.
MethodsX. 2024 Sep 3;13:102928. doi: 10.1016/j.mex.2024.102928. eCollection 2024 Dec.
Two-stage stratified sampling is a complex design that involves nested sampling units and stratification. The complexity increases when some strata contain too few sampled units for variance estimation, which necessitates the use of the collapsed strata technique, in which multiple strata are combined to ensure an adequate sample size. When collapsing strata, two cases can be distinguished, depending on whether a size variable associated with the variable of interest is available at the stratum level.
• We present computer-implementable formulas for total, mean, and ratio estimators, along with their corresponding sampling variance estimators, for stratified two-stage simple random sampling without replacement, and we provide ready-to-use algorithms.
• We introduce two methods for grouping strata: (1) a deterministic approach that uses stratum codes to define an ordinal variable, which orders the strata, and (2) a stochastic method that aims to minimize within-group inertia, which measures the heterogeneity within the newly formed groups of strata.
• We emphasize that, unlike the correlation between a size variable and the variable of interest at the stratum level, the bias of the sampling variance estimator for the collapsed strata technique is not invariant to linear transformations. It follows that a high correlation does not ensure a low-bias estimator of the sampling variance.
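As an illustration of the kind of estimator discussed here, the following is a minimal Python sketch. It uses the standard textbook formulas for a total under two-stage simple random sampling without replacement within one stratum, and the textbook collapsed strata variance estimator without a size variable. It is not taken from the paper: the function names, argument layout, and symbols (N_h primary units in the stratum, M_hi secondary units in primary unit i) are assumptions made for this example, and the paper's own formulas and algorithms may differ.

```python
from statistics import mean, variance


def stratum_total_and_variance(N_h, psus):
    """Estimate a stratum total and its sampling variance (textbook formulas).

    N_h  : number of primary units (PSUs) in the stratum (assumed known).
    psus : list of (M_hi, y_values), one entry per sampled PSU, where M_hi is
           the number of secondary units in the PSU and y_values are the
           observations on the sampled secondary units.
    """
    n_h = len(psus)
    if n_h < 2:
        # With a single sampled PSU the between-PSU variance is not
        # estimable; this is the situation that motivates collapsing strata.
        raise ValueError("at least two sampled PSUs are needed")
    psu_totals = []
    within = 0.0
    for M_hi, y in psus:
        m_hi = len(y)
        psu_totals.append(M_hi * mean(y))  # estimated PSU total
        if m_hi < M_hi:  # a fully enumerated PSU contributes no second-stage variance
            if m_hi < 2:
                raise ValueError("at least two sampled units per PSU are needed")
            within += M_hi ** 2 * (1 - m_hi / M_hi) * variance(y) / m_hi
    total = N_h / n_h * sum(psu_totals)
    between = N_h ** 2 * (1 - n_h / N_h) * variance(psu_totals) / n_h
    return total, between + N_h / n_h * within


def collapsed_variance(stratum_totals):
    """Textbook collapsed strata variance estimator for one group of strata.

    stratum_totals : estimated totals of the G strata collapsed into the
                     group (no size variable is used in this version).
    """
    G = len(stratum_totals)
    if G < 2:
        raise ValueError("a group must contain at least two strata")
    centre = sum(stratum_totals) / G
    return G / (G - 1) * sum((t - centre) ** 2 for t in stratum_totals)
```

For a whole population, the stratum totals and variance estimates would be summed over strata (or over groups of collapsed strata). The collapsed estimator measures the spread of stratum totals around their group mean, so real differences between the strata in a group inflate it. That is why the way strata are grouped matters.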