Binary Classifier Calibration using an Ensemble of Near Isotonic Regression Models.

Author information

Mahdi Pakdaman Naeini, Gregory F. Cooper

Affiliations

Intelligent Systems Program, University of Pittsburgh, Pittsburgh, USA.

Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, USA.

Publication information

Proc IEEE Int Conf Data Min. 2016 Dec;2016:360-369. doi: 10.1109/ICDM.2016.0047. Epub 2017 Feb 2.

DOI:10.1109/ICDM.2016.0047

PMID:28316511

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5351887/

Abstract

Learning accurate probabilistic models from data is crucial in many practical data mining tasks. In this paper we present a new non-parametric calibration method called ensemble of near isotonic regression (ENIR). The method can be considered an extension of BBQ [20], a recently proposed calibration method, as well as of the commonly used calibration method based on isotonic regression (IsoRegC) [27]. ENIR is designed to address the key limitation of IsoRegC, which is the monotonicity assumption of the predictions. Similar to BBQ, the method post-processes the output of a binary classifier to obtain calibrated probabilities. Thus it can be used with many existing classification models to generate accurate probabilistic predictions. We demonstrate the performance of ENIR on synthetic and real datasets for commonly applied binary classification models. Experimental results show that the method outperforms several common binary classifier calibration methods. In particular, on the real data, ENIR commonly performs statistically significantly better than the other methods, and never worse. It is able to improve the calibration power of classifiers while retaining their discrimination power. The method is also computationally tractable for large-scale datasets, as it runs in O(N log N) time, where N is the number of samples.
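For orientation, the isotonic-regression baseline (IsoRegC) that the abstract contrasts ENIR with can be sketched with the pool-adjacent-violators algorithm. This is a minimal illustration of that baseline only, not ENIR and not the authors' code. The function names `fit_isotonic` and `calibrate` are hypothetical, the fitted map is treated as a step function, and tied scores receive no special handling.

```python
import bisect
from typing import List, Sequence, Tuple


def fit_isotonic(scores: Sequence[float],
                 labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Fit a monotone step function from classifier scores to probabilities.

    Returns (starts, values): the lowest score in each pooled block and the
    fraction of positive labels in that block.
    """
    # Sorting dominates the cost, so fitting takes O(N log N) time.
    pairs = sorted(zip(scores, labels))
    # Each block holds [sum of labels, number of samples, lowest score].
    blocks: List[list] = []
    for score, label in pairs:
        blocks.append([float(label), 1, score])
        # Pool adjacent blocks while their means violate monotonicity.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, _ = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    starts = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return starts, values


def calibrate(score: float, starts: List[float], values: List[float]) -> float:
    """Map a raw score to the calibrated value of the block it falls in."""
    idx = bisect.bisect_right(starts, score) - 1
    return values[max(idx, 0)]  # scores below every block use the first block
```

The fitted map is non-decreasing in the raw score by construction, which is the monotonicity assumption that, per the abstract, ENIR is designed to relax.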

Similar articles

Binary Classifier Calibration using an Ensemble of Near Isotonic Regression Models.

Proc IEEE Int Conf Data Min. 2016 Dec;2016:360-369. doi: 10.1109/ICDM.2016.0047. Epub 2017 Feb 2.

Binary Classifier Calibration Using an Ensemble of Piecewise Linear Regression Models.

Knowl Inf Syst. 2018 Jan;54(1):151-170. doi: 10.1007/s10115-017-1133-2. Epub 2017 Nov 17.

Binary Classifier Calibration Using an Ensemble of Linear Trend Estimation.

Proc SIAM Int Conf Data Min. 2016 May;2016:261-269. doi: 10.1137/1.9781611974348.30.

Obtaining Well Calibrated Probabilities Using Bayesian Binning.

Proc AAAI Conf Artif Intell. 2015 Jan;2015:2901-2907.

Binary Classifier Calibration Using a Bayesian Non-Parametric Approach.

Proc SIAM Int Conf Data Min. 2015;2015:208-216. doi: 10.1137/1.9781611974010.24.

Targeting the uncertainty of predictions at patient-level using an ensemble of classifiers coupled with calibration methods, Venn-ABERS, and Conformal Predictors: A case study in AD.

J Biomed Inform. 2020 Jan;101:103350. doi: 10.1016/j.jbi.2019.103350. Epub 2019 Dec 6.

A probabilistic classifier ensemble weighting scheme based on cross-validated accuracy estimates.

Data Min Knowl Discov. 2019;33(6):1674-1709. doi: 10.1007/s10618-019-00638-y. Epub 2019 Jun 17.

An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples.

PeerJ Comput Sci. 2021 Sep 16;7:e671. doi: 10.7717/peerj-cs.671. eCollection 2021.

End-to-End Ensemble Learning by Exploiting the Correlation Between Individuals and Weights.

IEEE Trans Cybern. 2021 May;51(5):2835-2846. doi: 10.1109/TCYB.2019.2931071. Epub 2021 Apr 15.

A Correction Method of a Binary Classifier Applied to Multi-Label Pairwise Models.

Int J Neural Syst. 2018 Nov;28(9):1750062. doi: 10.1142/S0129065717500629. Epub 2017 Dec 17.

Cited by

Calibrating a Transformer-Based Model's Confidence on Community-Engaged Research Studies: Decision Support Evaluation Study.

JMIR Form Res. 2023 Mar 20;7:e41516. doi: 10.2196/41516.

An Explainable Machine Learning Model for Material Backorder Prediction in Inventory Management.

Sensors (Basel). 2021 Nov 27;21(23):7926. doi: 10.3390/s21237926.

Binary Classifier Calibration Using an Ensemble of Piecewise Linear Regression Models.

Knowl Inf Syst. 2018 Jan;54(1):151-170. doi: 10.1007/s10115-017-1133-2. Epub 2017 Nov 17.

References

Binary Classifier Calibration Using a Bayesian Non-Parametric Approach.

Proc SIAM Int Conf Data Min. 2015;2015:208-216. doi: 10.1137/1.9781611974010.24.

Obtaining Well Calibrated Probabilities Using Bayesian Binning.

Proc AAAI Conf Artif Intell. 2015 Jan;2015:2901-2907.

Predicting accurate probabilities with a ranking loss.

Proc Int Conf Mach Learn. 2012;2012:703-710.

Calibrating predictive model estimates to support personalized medicine.

J Am Med Inform Assoc. 2012 Mar-Apr;19(2):263-74. doi: 10.1136/amiajnl-2011-000291. Epub 2011 Oct 7.
