Janes Holly, Longton Gary, Pepe Margaret
Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
Stata J. 2009 Jan 1;9(1):17-39.
Classification accuracy is the ability of a marker or diagnostic test to discriminate between two groups of individuals, cases and controls, and is commonly summarized using the receiver operating characteristic (ROC) curve. In studies of classification accuracy, there are often covariates that should be incorporated into the ROC analysis. We describe three different ways of using covariate information. For factors that affect marker observations among controls, we present a method for covariate adjustment. For factors that affect discrimination (i.e. the ROC curve), we describe methods for modelling the ROC curve as a function of covariates. Finally, for factors that contribute to discrimination, we propose combining the marker and covariate information, and ask how much discriminatory accuracy improves with the addition of the marker to the covariates (incremental value). These methods follow naturally when representing the ROC curve as a summary of the distribution of case marker observations, standardized with respect to the control distribution.
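The closing idea of the abstract, that the ROC curve summarizes case marker observations after standardizing them against the control distribution, can be illustrated with a small sketch. This is not the authors' Stata implementation. It assumes a single discrete covariate, handled by stratification, and empirical control distributions. The function names (`placement_values`, `roc_at`, `adjusted_placement_values`) and the toy data are hypothetical. Each case value is converted to a placement value, the fraction of controls with a marker value at least as large. The empirical ROC at false-positive rate t is then the proportion of cases whose placement value is at most t. For covariate adjustment, each case is standardized against controls that share its covariate value.

```python
import numpy as np


def placement_values(case_vals, control_vals):
    """Fraction of controls whose marker value is >= each case value."""
    controls = np.sort(np.asarray(control_vals, dtype=float))
    cases = np.asarray(case_vals, dtype=float)
    # side="left" counts the controls strictly below each case value
    n_below = np.searchsorted(controls, cases, side="left")
    return 1.0 - n_below / controls.size


def roc_at(pv, t):
    """Empirical ROC(t): proportion of cases with placement value <= t."""
    return float(np.mean(np.asarray(pv) <= t))


def adjusted_placement_values(marker, is_case, stratum):
    """Standardize each case against controls in the same covariate stratum.

    Assumes every stratum that contains a case also contains controls.
    """
    marker = np.asarray(marker, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    stratum = np.asarray(stratum)
    case_strata = stratum[is_case]
    pv = np.empty(case_strata.size)
    for s in np.unique(case_strata):
        controls = marker[~is_case & (stratum == s)]
        cases = marker[is_case & (stratum == s)]
        pv[case_strata == s] = placement_values(cases, controls)
    return pv


# Toy data (hypothetical): the covariate shifts marker values in stratum "B"
# for cases and controls alike.
marker = [1, 2, 3, 4, 3.5, 5, 11, 12, 13, 14, 11.5, 13.5]
is_case = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]
stratum = ["A"] * 6 + ["B"] * 6

pv_adj = adjusted_placement_values(marker, is_case, stratum)
print("adjusted placement values:", pv_adj)
print("covariate-adjusted ROC at t = 0.25:", roc_at(pv_adj, 0.25))
```

With a continuous covariate, stratification would not apply directly. The control distribution would instead need to be modelled as a function of the covariate, which this sketch does not attempt.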