Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Charterhouse Square, EC1M 6BQ, London.
J Med Screen. 2014 Mar;21(1):51-6. doi: 10.1177/0969141313517497. Epub 2014 Jan 9.
The area under a receiver operating characteristic (ROC) curve (the AUC) is used as a measure of the performance of a screening or diagnostic test. Here we assess the validity of the AUC as such a measure.
Assuming the test results follow Gaussian distributions in affected and unaffected individuals, standard mathematical formulae were used to relate the AUC to the detection rate (DR, or sensitivity) and the false-positive rate (FPR) of a test. These formulae were used to calculate screening performance (DR for a given FPR, or FPR for a given DR) for different AUC values and for different standard deviations of the test result in affected and unaffected individuals.
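The abstract does not reproduce the formulae. As a sketch of what the standard binormal relationships look like, assume test values tend to be higher in affected individuals, and write the means and standard deviations as \mu_a, \sigma_a (affected) and \mu_u, \sigma_u (unaffected), with cut-off c and standard Gaussian distribution function \Phi. These symbols are introduced here for illustration and are not taken from the paper.

```latex
\begin{aligned}
\mathrm{AUC} &= \Phi\!\left(\frac{\mu_a - \mu_u}{\sqrt{\sigma_a^2 + \sigma_u^2}}\right),\\
\mathrm{FPR} &= 1 - \Phi\!\left(\frac{c - \mu_u}{\sigma_u}\right),\\
\mathrm{DR}  &= 1 - \Phi\!\left(\frac{c - \mu_a}{\sigma_a}\right)
              = \Phi\!\left(\frac{\Phi^{-1}(\mathrm{AUC})\sqrt{\sigma_a^2 + \sigma_u^2}
                - \sigma_u\,\Phi^{-1}(1 - \mathrm{FPR})}{\sigma_a}\right).
\end{aligned}
```

The last expression shows that, for a fixed AUC and FPR, the DR still depends on \sigma_a and \sigma_u separately, not only through the AUC.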
The DR for a given FPR is strongly dependent on the relative difference between the standard deviations of the test variable in affected and unaffected individuals. Consequently, two tests with the same AUC can have different DRs for the same FPR. For example, a test with an AUC of 0.75 has a DR of 24% for a 5% FPR if the standard deviations are the same in affected and unaffected individuals, but a DR of 39% for the same 5% FPR if the standard deviation in affected individuals is 1.5 times that in unaffected individuals.
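The two figures in this example can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a binormal model with the unaffected distribution standardized to mean 0 and SD 1, higher test values in affected individuals, and a hypothetical helper named `detection_rate` whose `sd_ratio` argument is the SD in affected individuals divided by the SD in unaffected individuals.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, SD 1


def detection_rate(auc, fpr, sd_ratio=1.0):
    """DR at a given FPR under an assumed binormal model.

    Unaffected results are taken as N(0, 1); affected results as
    N(mean_shift, sd_ratio), with higher values in affected individuals.
    """
    # Separation of the means implied by the AUC and the two SDs.
    mean_shift = STD_NORMAL.inv_cdf(auc) * sqrt(1.0 + sd_ratio ** 2)
    # Cut-off that leaves a fraction `fpr` of unaffected results above it.
    cutoff = STD_NORMAL.inv_cdf(1.0 - fpr)
    # Fraction of affected results above the same cut-off.
    return 1.0 - STD_NORMAL.cdf((cutoff - mean_shift) / sd_ratio)


for ratio in (1.0, 1.5):
    dr = detection_rate(auc=0.75, fpr=0.05, sd_ratio=ratio)
    print(f"SD ratio {ratio}: DR = {dr:.0%}")
# SD ratio 1.0: DR = 24%
# SD ratio 1.5: DR = 39%
```

Both calls use the same AUC and the same FPR; only the SD ratio changes, which is enough to move the DR from 24% to 39%.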
The AUC is an unreliable measure of screening performance because in practice the standard deviation of a screening or diagnostic test in affected and unaffected individuals can differ. The problem is avoided by not using AUC at all, and instead specifying DRs for given FPRs or FPRs for given DRs.