Janes Holly, Pepe Margaret S
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe St., Baltimore, MD 21205, USA.
Biometrics. 2008 Mar;64(1):1-9. doi: 10.1111/j.1541-0420.2007.00823.x. Epub 2007 May 14.
In case-control studies evaluating the classification accuracy of a marker, controls are often matched to cases with respect to factors associated with the marker and disease status. In contrast with matching in epidemiologic etiology studies, matching in the classification setting has not been rigorously studied. In this article, we consider the implications of matching in terms of the choice of statistical analysis, efficiency, and assessment of the incremental value of the marker over the matching covariates. We find that adjustment for the matching covariates is essential, as unadjusted summaries of classification accuracy can be biased. In many settings, matching is the most efficient covariate-dependent sampling scheme, and we provide an expression for the optimal matching ratio. However, we also show that matching greatly complicates estimation of the incremental value of the marker. We recommend that matching be carefully considered in the context of these findings.
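The claim that unadjusted summaries can be biased under matching can be illustrated with a small simulation. The following is a minimal sketch under assumptions not taken from the article: a single binary matching covariate `z`, normally distributed marker values, a shift `beta` in the marker due to `z`, and a case-control separation `delta` that is the same in both strata. All function and variable names are hypothetical. The adjusted summary used here, a case-weighted average of stratum-specific AUCs, is one simple covariate-adjusted measure and is not necessarily the estimator the authors propose.

```python
import random


def auc(cases, controls):
    """Empirical AUC: fraction of case-control pairs where the case marker is higher (ties count 1/2)."""
    wins = 0.0
    for y_case in cases:
        for y_control in controls:
            if y_case > y_control:
                wins += 1.0
            elif y_case == y_control:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def simulate_matched(n_per_stratum=400, delta=1.0, beta=2.0, seed=0):
    """Simulate a 1:1 frequency-matched design (assumed setup, not from the article).

    Within each stratum z, controls ~ N(beta*z, 1) and cases ~ N(delta + beta*z, 1),
    and matching forces equal numbers of cases and controls in every stratum.
    """
    rng = random.Random(seed)
    data = {}
    for z in (0, 1):
        cases = [rng.gauss(delta + beta * z, 1.0) for _ in range(n_per_stratum)]
        controls = [rng.gauss(beta * z, 1.0) for _ in range(n_per_stratum)]
        data[z] = (cases, controls)
    return data


data = simulate_matched()

# Unadjusted (pooled) AUC: ignores the matching covariate entirely.
pooled_cases = data[0][0] + data[1][0]
pooled_controls = data[0][1] + data[1][1]
pooled_auc = auc(pooled_cases, pooled_controls)

# Adjusted AUC: stratum-specific AUCs averaged with weights given by
# the share of cases in each stratum.
n_cases = len(pooled_cases)
adjusted_auc = sum(
    len(data[z][0]) / n_cases * auc(data[z][0], data[z][1]) for z in data
)

print(f"pooled (unadjusted) AUC: {pooled_auc:.3f}")
print(f"covariate-adjusted AUC:  {adjusted_auc:.3f}")
```

In this setup the marker discriminates equally well within each stratum, yet the pooled AUC comes out lower than the adjusted one, because pooling compares cases from one stratum with controls from the other, whose marker values are shifted by the covariate rather than by disease status.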