Is the area under an ROC curve a valid measure of the performance of a screening or diagnostic test?

Affiliation

Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Charterhouse Square, EC1M 6BQ, London.

Publication information

J Med Screen. 2014 Mar;21(1):51-6. doi: 10.1177/0969141313517497. Epub 2014 Jan 9.

DOI: 10.1177/0969141313517497
PMID: 24407586

Abstract

OBJECTIVES

The area under a receiver operating characteristic (ROC) curve (the AUC) is used as a measure of the performance of a screening or diagnostic test. We here assess the validity of the AUC.

METHODS

Assuming the test results follow Gaussian distributions in affected and unaffected individuals, standard mathematical formulae were used to describe the relationship between the detection rate (DR) (or sensitivity) and the false-positive rate (FPR) of a test with the AUC. These formulae were used to calculate the screening performance (DR for a given FPR, or FPR for a given DR) for different AUC values according to different standard deviations of the test result in affected and unaffected individuals.
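The abstract does not reproduce the "standard mathematical formulae" it refers to. A sketch of what they plausibly look like under the stated Gaussian assumption is given below. The notation is an assumption made here, not taken from the paper: test values are X_U ~ N(mu_U, sigma_U^2) in unaffected and X_A ~ N(mu_A, sigma_A^2) in affected individuals, with mu_A > mu_U, a result above cutoff c counts as positive, and Phi is the standard normal cumulative distribution function.

```latex
% Binormal model: rates at a cutoff c
\mathrm{FPR}(c) = 1 - \Phi\!\left(\frac{c - \mu_U}{\sigma_U}\right),
\qquad
\mathrm{DR}(c) = 1 - \Phi\!\left(\frac{c - \mu_A}{\sigma_A}\right)

% Area under the ROC curve
\mathrm{AUC} = \Phi\!\left(\frac{\mu_A - \mu_U}{\sqrt{\sigma_A^2 + \sigma_U^2}}\right)

% DR for a given FPR (eliminate c using the FPR equation)
\mathrm{DR} = \Phi\!\left(\frac{(\mu_A - \mu_U) - \sigma_U\,\Phi^{-1}(1 - \mathrm{FPR})}{\sigma_A}\right)

% FPR for a given DR (eliminate c using the DR equation)
\mathrm{FPR} = \Phi\!\left(\frac{\sigma_A\,\Phi^{-1}(\mathrm{DR}) - (\mu_A - \mu_U)}{\sigma_U}\right)
```

The AUC fixes only the ratio of the mean difference to the square root of the summed variances, whereas the DR at a given FPR also depends on sigma_A and sigma_U separately. That is why the ratio of the two standard deviations enters the calculation.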

RESULTS

The DR for a given FPR is strongly dependent on relative differences in the standard deviation of the test variable in affected and unaffected individuals. Consequently, two tests with the same AUC can have a different DR for the same FPR. For example, an AUC of 0.75 has a DR of 24% for a 5% FPR if the standard deviations are the same in affected and unaffected individuals, but 39% for the same 5% FPR if the standard deviation in affected individuals is 1.5 times that in unaffected individuals.
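The worked example above can be checked numerically. The following is a minimal sketch, assuming Gaussian test results with higher values in affected individuals and the unaffected standard deviation scaled to 1. The function name `detection_rate` and the parameter `sd_ratio` (affected SD divided by unaffected SD) are illustrative choices, not names from the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def detection_rate(auc: float, fpr: float, sd_ratio: float = 1.0) -> float:
    """DR at a given FPR under a binormal model (illustrative sketch).

    Assumes the unaffected SD is 1 and the affected SD is sd_ratio,
    with a higher mean in affected individuals.
    """
    # Mean separation implied by the AUC: AUC = Phi(delta / sqrt(1 + sd_ratio^2))
    delta = _STD_NORMAL.inv_cdf(auc) * sqrt(1.0 + sd_ratio ** 2)
    # Cutoff, in unaffected SD units, that gives the requested FPR
    cutoff = _STD_NORMAL.inv_cdf(1.0 - fpr)
    # Proportion of affected individuals above that cutoff
    return _STD_NORMAL.cdf((delta - cutoff) / sd_ratio)


if __name__ == "__main__":
    for ratio in (1.0, 1.5):
        dr = detection_rate(auc=0.75, fpr=0.05, sd_ratio=ratio)
        print(f"SD ratio {ratio}: DR = {dr:.0%} at 5% FPR")
```

Under these assumptions the sketch gives about 24% for equal standard deviations and about 39% for a ratio of 1.5, in line with the figures quoted in the abstract.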

CONCLUSION

The AUC is an unreliable measure of screening performance because in practice the standard deviation of a screening or diagnostic test in affected and unaffected individuals can differ. The problem is avoided by not using AUC at all, and instead specifying DRs for given FPRs or FPRs for given DRs.

