ROC and AUC with a Binary Predictor: a Potentially Misleading Metric.

Author information

Muschelli John

Affiliation

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe St, Baltimore, MD 21205.

Publication information

J Classif. 2020 Oct;37(3):696-708. doi: 10.1007/s00357-019-09345-1. Epub 2019 Dec 23.

DOI: 10.1007/s00357-019-09345-1
PMID: 33250548
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7695228/
Abstract

In the analysis of binary outcomes, the receiver operator characteristic (ROC) curve is heavily used to show the performance of a model or algorithm. The ROC curve is informative about performance over a series of thresholds and can be summarized by the area under the curve (AUC), a single number. When a predictor is categorical, the number of potential thresholds on the ROC curve is one less than the number of categories; when the predictor is binary, there is only one threshold. As the AUC may be used in decision-making processes to determine the best model, it is important to discuss how it agrees with the intuition from the ROC curve. We discuss how the interpolation of the curve between thresholds can substantially change the AUC when predictors are binary. Overall, we show that the estimated AUC most commonly reported by software corresponds to a linear interpolation of the ROC curve with binary predictors, which we believe can lead to misleading results. We compare R, Python, Stata, and SAS software implementations. We recommend reporting the interpolation used and discuss the merit of using the step function interpolator, also referred to as the "pessimistic" approach by Fawcett (2006).
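The gap between the two interpolations can be illustrated with a small sketch. With a binary predictor the ROC curve has a single interior point (FPR, TPR), so both areas follow from simple geometry: joining (0, 0), (FPR, TPR), and (1, 1) with straight lines gives (sensitivity + specificity) / 2, while the step function that stays at the lower value until the threshold is reached gives sensitivity × specificity. The function name `binary_predictor_auc` and the toy data are illustrative assumptions, not code from the paper or from any of the software packages it compares.

```python
def binary_predictor_auc(y_true, x_pred):
    """Return (linear_auc, step_auc) for a 0/1 predictor and a 0/1 outcome.

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, x_pred))
    tp = sum(1 for y, x in pairs if y == 1 and x == 1)
    fn = sum(1 for y, x in pairs if y == 1 and x == 0)
    fp = sum(1 for y, x in pairs if y == 0 and x == 1)
    tn = sum(1 for y, x in pairs if y == 0 and x == 0)

    tpr = tp / (tp + fn)  # sensitivity
    fpr = fp / (fp + tn)  # 1 - specificity

    # Linear interpolation: a triangle from (0, 0) to (fpr, tpr), then a
    # trapezoid from (fpr, tpr) to (1, 1). Simplifies to (tpr + 1 - fpr) / 2.
    linear_auc = 0.5 * fpr * tpr + (1 - fpr) * (tpr + 1) / 2

    # Step ("pessimistic") interpolation: the curve stays at 0 until fpr,
    # then at tpr until 1. Simplifies to sensitivity * specificity.
    step_auc = tpr * (1 - fpr)

    return linear_auc, step_auc


# Toy data (assumed): 10 cases and 10 controls.
y = [1] * 10 + [0] * 10
x = [1] * 8 + [0] * 2 + [1] * 3 + [0] * 7  # sensitivity 0.8, specificity 0.7

linear_auc, step_auc = binary_predictor_auc(y, x)
print(f"linear: {linear_auc:.2f}")  # linear: 0.75
print(f"step:   {step_auc:.2f}")    # step:   0.56
```

On this toy data the same single-threshold predictor scores 0.75 under linear interpolation and 0.56 under the step function, which is why stating the interpolation alongside a reported AUC matters.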

