Janes Holly, Pepe Margaret
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA.
Biostatistics. 2006 Jul;7(3):456-68. doi: 10.1093/biostatistics/kxj018. Epub 2006 Jan 20.
The case-control design is frequently used to study the discriminatory accuracy of a screening or diagnostic biomarker. Yet, the appropriate ratio in which to sample cases and controls has never been determined. It is common for researchers to sample equal numbers of cases and controls, a strategy that can be optimal for studies of association. However, considerations are quite different when the biomarker is to be used for classification. In this paper, we provide an expression for the optimal case-control ratio, when the accuracy of the biomarker is quantified by the receiver operating characteristic (ROC) curve. We show how it can be integrated with choosing the overall sample size to yield an efficient study design with specified power and type-I error. We also derive the optimal case-control ratios for estimating the area under the ROC curve and the area under part of the ROC curve. Our methods are applied to a study of a new marker for adenocarcinoma in patients with Barrett's esophagus.
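As an illustration of the kind of calculation the abstract describes, the following is a minimal sketch and not the paper's own expression. It assumes the standard large-sample variance approximation for the empirical ROC estimate at a fixed false-positive rate t, namely ROC(t)(1 - ROC(t))/n_cases + ROC'(t)^2 t(1 - t)/n_controls, and it assumes a binormal ROC curve ROC(t) = Phi(a + b * Phi^{-1}(t)). The function names and the parameters a and b are hypothetical and chosen only for this example. Minimizing a variance of the form A/n_cases + B/n_controls for a fixed total sample size gives n_cases/n_controls = sqrt(A/B).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def binormal_roc(t, a, b):
    """ROC(t) = Phi(a + b * Phi^{-1}(t)) under the assumed binormal model."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(t))


def binormal_roc_slope(t, a, b):
    """Derivative of the binormal ROC curve with respect to t."""
    z = _STD_NORMAL.inv_cdf(t)
    return b * _STD_NORMAL.pdf(a + b * z) / _STD_NORMAL.pdf(z)


def approx_variance(n_cases, n_controls, t, a, b):
    """Assumed large-sample variance of the empirical ROC(t) estimate."""
    roc = binormal_roc(t, a, b)
    slope = binormal_roc_slope(t, a, b)
    return roc * (1 - roc) / n_cases + slope ** 2 * t * (1 - t) / n_controls


def optimal_case_control_ratio(t, a, b):
    """n_cases / n_controls that minimizes approx_variance for a fixed total."""
    roc = binormal_roc(t, a, b)
    slope = binormal_roc_slope(t, a, b)
    case_term = roc * (1 - roc)                # variance contribution from cases
    control_term = slope ** 2 * t * (1 - t)    # variance contribution from controls
    return math.sqrt(case_term / control_term)


if __name__ == "__main__":
    # Hypothetical design values, chosen only for illustration.
    t, a, b, total = 0.10, 1.5, 1.0, 200
    ratio = optimal_case_control_ratio(t, a, b)
    n_cases = total * ratio / (1 + ratio)
    n_controls = total - n_cases
    print(f"cases per control: {ratio:.3f}")
    print(f"variance, optimal split: {approx_variance(n_cases, n_controls, t, a, b):.6f}")
    print(f"variance, equal split:   {approx_variance(total / 2, total / 2, t, a, b):.6f}")
```

Under these assumptions the ratio depends on where on the ROC curve the study is focused, which is one way to see why equal sampling of cases and controls need not be the most efficient choice for a classification study.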