Calibrate: Interactive Analysis of Probabilistic Model Output.

Author information

Xenopoulos Peter, Rulff Joao, Nonato Luis Gustavo, Barr Brian, Silva Claudio

Publication information

IEEE Trans Vis Comput Graph. 2023 Jan;29(1):853-863. doi: 10.1109/TVCG.2022.3209489. Epub 2022 Dec 16.

DOI:10.1109/TVCG.2022.3209489

Abstract

Analyzing classification model performance is a crucial task for machine learning practitioners. While practitioners often use count-based metrics derived from confusion matrices, like accuracy, many applications, such as weather prediction, sports betting, or patient risk prediction, rely on a classifier's predicted probabilities rather than predicted labels. In these instances, practitioners are concerned with producing a calibrated model, that is, one which outputs probabilities that reflect those of the true distribution. Model calibration is often analyzed visually through static reliability diagrams; however, the traditional calibration visualization may suffer from a variety of drawbacks due to the strong aggregations it necessitates. Furthermore, count-based approaches are unable to sufficiently analyze model calibration. We present Calibrate, an interactive reliability diagram that addresses the aforementioned issues. Calibrate constructs a reliability diagram that is resistant to drawbacks in traditional approaches, and allows for interactive subgroup analysis and instance-level inspection. We demonstrate the utility of Calibrate through use cases on both real-world and synthetic data. We further validate Calibrate by presenting the results of a think-aloud experiment with data scientists who routinely analyze model calibration.
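To make the notion of a reliability diagram concrete, here is a minimal sketch of the traditional, static, equal-width-binned version that the abstract describes as relying on strong aggregation. It is not the method used by Calibrate, which the abstract does not specify. The function names `reliability_bins` and `expected_calibration_error`, the choice of equal-width bins, and the default of 10 bins are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def reliability_bins(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Aggregate binary predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed positive rate, count)
    tuple per non-empty bin. Hypothetical helper, for illustration only.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        sums[idx] += p
        hits[idx] += y
        counts[idx] += 1
    return [
        (sums[i] / counts[i], hits[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Count-weighted mean gap between predicted and observed rates per bin."""
    total = len(probs)
    return sum(
        count / total * abs(mean_pred - observed)
        for mean_pred, observed, count in reliability_bins(probs, labels, n_bins)
    )
```

Plotting each bin's mean predicted probability against its observed positive rate gives the static diagram; points on the diagonal indicate calibration. Because every instance in a bin collapses into one point, the result depends on the number and placement of bins and hides subgroup and instance-level behavior, which are the limitations the abstract attributes to the traditional visualization.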


Similar articles

Calibrate: Interactive Analysis of Probabilistic Model Output.

IEEE Trans Vis Comput Graph. 2023 Jan;29(1):853-863. doi: 10.1109/TVCG.2022.3209489. Epub 2022 Dec 16.

Stable reliability diagrams for probabilistic classifiers.

Proc Natl Acad Sci U S A. 2021 Feb 23;118(8). doi: 10.1073/pnas.2016191118.

Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure.

Neural Comput. 2002 Jan;14(1):21-41. doi: 10.1162/089976602753284446.

IDA-MIL: Classification of Glomerular with Spike-like Projections via Multiple Instance Learning with Instance-level Data Augmentation.

Comput Methods Programs Biomed. 2022 Oct;225:107106. doi: 10.1016/j.cmpb.2022.107106. Epub 2022 Sep 2.

A Novel Blunge Calibration Intelligent Feature Classification Model for the Prediction of Hypothyroid Disease.

Sensors (Basel). 2023 Jan 18;23(3):1128. doi: 10.3390/s23031128.

Smooth isotonic regression: a new method to calibrate predictive models.

AMIA Jt Summits Transl Sci Proc. 2011;2011:16-20. Epub 2011 Mar 7.

Squares: Supporting Interactive Performance Analysis for Multiclass Classifiers.

IEEE Trans Vis Comput Graph. 2017 Jan;23(1):61-70. doi: 10.1109/TVCG.2016.2598828.

Interactive polar diagrams for model comparison.

Comput Methods Programs Biomed. 2023 Dec;242:107843. doi: 10.1016/j.cmpb.2023.107843. Epub 2023 Oct 6.

Calibration of Portable Particulate Matter-Monitoring Device using Web Query and Machine Learning.

Saf Health Work. 2019 Dec;10(4):452-460. doi: 10.1016/j.shaw.2019.08.002. Epub 2019 Aug 19.

Formal definition of the MARS method for quantifying the unique target class discoveries of selected machine classifiers.

F1000Res. 2022 Apr 4;11:391. doi: 10.12688/f1000research.110567.2. eCollection 2022.

Cited by

Machine learning-based analysis identifies a 13-gene prognostic signature to improve the clinical outcomes of colorectal cancer.

J Gastrointest Oncol. 2024 Oct 31;15(5):2100-2116. doi: 10.21037/jgo-24-325. Epub 2024 Oct 24.

Machine-guided discovery of a real-world rogue wave model.

Proc Natl Acad Sci U S A. 2023 Nov 28;120(48):e2306275120. doi: 10.1073/pnas.2306275120. Epub 2023 Nov 20.
