Department of Mathematics, Hope College, Holland, Michigan, United States of America.
PLoS One. 2012;7(2):e32058. doi: 10.1371/journal.pone.0032058. Epub 2012 Feb 13.
BACKGROUND: Typically, a two-phase (double) sampling strategy is employed when classifications are subject to error and a gold standard (perfect) classifier is available. Two-phase sampling involves classifying the entire sample with an imperfect classifier and a subset of the sample with the gold standard.
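As an illustration of the two-phase design described above, the following is a minimal sketch of the standard double-sampling prevalence estimate: the imperfect classifier's positive and negative fractions in the full sample are weighted by the true-positive rates observed in the gold-standard subsample. The function name, argument names, and counts are hypothetical and are not taken from the paper or its software.

```python
def two_phase_prevalence(n_pos, n_neg, m_pos, d_pos, m_neg, d_neg):
    """Double-sampling prevalence estimate (illustrative sketch).

    n_pos, n_neg: full-sample counts labelled positive / negative
                  by the imperfect classifier.
    m_pos, d_pos: imperfect-positives re-checked with the gold standard,
                  and how many of those were truly positive.
    m_neg, d_neg: imperfect-negatives re-checked with the gold standard,
                  and how many of those were truly positive.
    """
    if m_pos <= 0 or m_neg <= 0:
        raise ValueError("each validation stratum needs at least one unit")
    n = n_pos + n_neg
    # P(truly positive | imperfect positive), from the validation subsample
    ppv = d_pos / m_pos
    # P(truly positive | imperfect negative), from the validation subsample
    missed = d_neg / m_neg
    # Law of total probability over the imperfect classifier's label
    return (n_pos / n) * ppv + (n_neg / n) * missed


# Hypothetical counts: 1000 units screened, 50 validated in each stratum.
estimate = two_phase_prevalence(300, 700, 50, 40, 50, 5)
print(round(estimate, 4))
```

The cost trade-off in this design comes from how many units are sent to the (usually more expensive) gold standard, which is what the comparison with reclassification sampling turns on.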
METHODOLOGY/PRINCIPAL FINDINGS: In this paper we consider an alternative strategy, termed reclassification sampling, which involves classifying individuals more than once using the imperfect classifier. Estimates of sensitivity, specificity and prevalence are provided for reclassification sampling when either one or two binary classifications of each individual using the imperfect classifier are available. The robustness of the estimates and design decisions to model assumptions is considered. Software is provided to compute estimates and to advise on the optimal sampling strategy.
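To make concrete how prevalence can be recovered from an imperfect classifier alone, here is a minimal sketch of the well-known Rogan-Gladen correction for a single classification per individual. It assumes the sensitivity and specificity are already known, which is an assumption of this sketch and not necessarily the setting or the estimator used in the paper; the function and parameter names are hypothetical.

```python
def rogan_gladen(apparent, sensitivity, specificity):
    """Correct an apparent (observed) positive fraction for
    misclassification, assuming known sensitivity and specificity.

    Uses: apparent = p * se + (1 - p) * (1 - sp), solved for p.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("classifier must be better than chance (se + sp > 1)")
    p = (apparent + specificity - 1.0) / denom
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, p))


# Hypothetical inputs: 31% test positive, se = 0.8, sp = 0.9.
print(round(rogan_gladen(0.31, 0.8, 0.9), 4))
```

When sensitivity and specificity are not known in advance, extra information is needed to estimate them, which is the role played by repeated classifications or a gold-standard subsample.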
CONCLUSIONS/SIGNIFICANCE: In many practical situations, reclassification sampling is shown to be more cost-effective than two-phase sampling for estimating prevalence (lower standard error of estimates for the same cost).