Is the area under an ROC curve a valid measure of the performance of a screening or diagnostic test?

Affiliation

Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Charterhouse Square, EC1M 6BQ, London.

Publication information

J Med Screen. 2014 Mar;21(1):51-6. doi: 10.1177/0969141313517497. Epub 2014 Jan 9.

DOI:10.1177/0969141313517497

PMID:24407586

Abstract

OBJECTIVES

The area under a receiver operating characteristic (ROC) curve (the AUC) is used as a measure of the performance of a screening or diagnostic test. We here assess the validity of the AUC.

METHODS

Assuming the test results follow Gaussian distributions in affected and unaffected individuals, standard mathematical formulae were used to describe the relationship between the detection rate (DR) (or sensitivity) and the false-positive rate (FPR) of a test with the AUC. These formulae were used to calculate the screening performance (DR for a given FPR, or FPR for a given DR) for different AUC values according to different standard deviations of the test result in affected and unaffected individuals.
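The abstract does not reproduce the formulae it refers to. As a sketch only, the textbook binormal relationships are given below; the notation ($\mu_0, \sigma_0$ for unaffected individuals, $\mu_1, \sigma_1$ for affected individuals, cut-off $c$, and a result above $c$ counted as positive) is an assumption introduced here, not taken from the paper.

```latex
% Assumed notation: X_0 ~ N(\mu_0, \sigma_0^2) in unaffected individuals,
% X_1 ~ N(\mu_1, \sigma_1^2) in affected individuals, with \mu_1 > \mu_0.
\begin{align}
\mathrm{FPR}(c) &= 1 - \Phi\!\left(\frac{c - \mu_0}{\sigma_0}\right), &
\mathrm{DR}(c)  &= 1 - \Phi\!\left(\frac{c - \mu_1}{\sigma_1}\right), \\
\mathrm{AUC}    &= \Phi\!\left(\frac{\mu_1 - \mu_0}{\sqrt{\sigma_0^2 + \sigma_1^2}}\right), \\
\mathrm{DR}(\mathrm{FPR}) &= \Phi\!\left(\frac{\mu_1 - \mu_0 - \sigma_0\,\Phi^{-1}(1 - \mathrm{FPR})}{\sigma_1}\right).
\end{align}
```

The last line follows from solving the FPR expression for $c$ and substituting it into the DR expression. For a fixed AUC, the DR at a given FPR therefore still depends on the ratio $\sigma_1/\sigma_0$.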

RESULTS

The DR for a given FPR is strongly dependent on relative differences in the standard deviation of the test variable in affected and unaffected individuals. Consequently, two tests with the same AUC can have a different DR for the same FPR. For example, an AUC of 0.75 has a DR of 24% for a 5% FPR if the standard deviations are the same in affected and unaffected individuals, but 39% for the same 5% FPR if the standard deviation in affected individuals is 1.5 times that in unaffected individuals.
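The worked example above can be checked numerically. The following is a minimal sketch assuming the standard binormal model, with values above the cut-off counted as positive; the function names and the `sd_ratio` parameter (standard deviation in affected divided by that in unaffected individuals) are illustrative and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution: mean 0, SD 1


def _mean_separation(auc, sd_ratio):
    """Difference in means, in units of the unaffected SD, implied by an AUC."""
    # Binormal model: AUC = Phi(delta / sqrt(1 + sd_ratio**2))
    return sqrt(1.0 + sd_ratio ** 2) * _STD.inv_cdf(auc)


def detection_rate(auc, fpr, sd_ratio=1.0):
    """DR (sensitivity) at a given FPR for a test with the given AUC."""
    delta = _mean_separation(auc, sd_ratio)
    cutoff = _STD.inv_cdf(1.0 - fpr)  # cut-off in unaffected SD units
    return _STD.cdf((delta - cutoff) / sd_ratio)


def false_positive_rate(auc, dr, sd_ratio=1.0):
    """FPR at a given DR for a test with the given AUC."""
    delta = _mean_separation(auc, sd_ratio)
    cutoff = delta + sd_ratio * _STD.inv_cdf(1.0 - dr)
    return 1.0 - _STD.cdf(cutoff)


if __name__ == "__main__":
    for ratio in (1.0, 1.5):
        dr = detection_rate(0.75, 0.05, sd_ratio=ratio)
        print(f"AUC 0.75, FPR 5%, SD ratio {ratio}: DR = {dr:.0%}")
```

Under these assumptions the two calls give about 24% and 39%, in line with the figures quoted in the abstract.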

CONCLUSION

The AUC is an unreliable measure of screening performance because in practice the standard deviation of a screening or diagnostic test in affected and unaffected individuals can differ. The problem is avoided by not using AUC at all, and instead specifying DRs for given FPRs or FPRs for given DRs.
