Classifier uncertainty: evidence, potential impact, and probabilistic treatment.

Author information

Tötsch Niklas, Hoffmann Daniel

Affiliation

Faculty of Biology, University of Duisburg-Essen, Essen, Germany.

Publication information

PeerJ Comput Sci. 2021 Mar 4;7:e398. doi: 10.7717/peerj-cs.398. eCollection 2021.

Abstract

Classifiers are often tested on relatively small data sets, which should lead to uncertain performance metrics. Nevertheless, these metrics are usually taken at face value. We present an approach to quantify the uncertainty of classification performance metrics, based on a probability model of the confusion matrix. Application of our approach to classifiers from the scientific literature and a classification competition shows that uncertainties can be surprisingly large and limit performance evaluation. In fact, some published classifiers may be misleading. The application of our approach is simple and requires only the confusion matrix. It is agnostic of the underlying classifier. Our method can also be used for the estimation of sample sizes that achieve a desired precision of a performance metric.
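To illustrate how a probability model of the confusion matrix can turn four counts into uncertainty intervals, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the model, so this example assumes a Dirichlet posterior over the four cell probabilities with a uniform prior. The function name `metric_intervals`, the prior, and the example counts are all hypothetical.

```python
import random


def metric_intervals(tp, fn, tn, fp, n_samples=20000, seed=0):
    """Return 95% credible intervals for accuracy, sensitivity and specificity.

    Assumption (not from the paper): the cell probabilities follow a
    Dirichlet posterior with a uniform prior (one pseudo-count per cell).
    """
    rng = random.Random(seed)
    counts = (tp, fn, tn, fp)
    samples = {"accuracy": [], "sensitivity": [], "specificity": []}

    for _ in range(n_samples):
        # A Dirichlet draw is a vector of Gamma variates divided by their sum.
        g = [rng.gammavariate(c + 1, 1.0) for c in counts]
        total = sum(g)
        p_tp, p_fn, p_tn, p_fp = (x / total for x in g)

        samples["accuracy"].append(p_tp + p_tn)
        samples["sensitivity"].append(p_tp / (p_tp + p_fn))
        samples["specificity"].append(p_tn / (p_tn + p_fp))

    intervals = {}
    for name, values in samples.items():
        values.sort()
        lower = values[int(0.025 * n_samples)]
        upper = values[int(0.975 * n_samples) - 1]
        intervals[name] = (lower, upper)
    return intervals


# Hypothetical test sets with the same error rates but different sizes.
small = metric_intervals(tp=18, fn=2, tn=17, fp=3)
large = metric_intervals(tp=180, fn=20, tn=170, fp=30)

for label, result in (("n=40", small), ("n=400", large)):
    lower, upper = result["accuracy"]
    print(f"{label}: accuracy 95% interval = [{lower:.3f}, {upper:.3f}]")
```

Both hypothetical test sets have an observed accuracy of 0.875, but the interval for the 40-sample set is much wider than for the 400-sample set. Rerunning the function with increasing counts until the interval is narrow enough gives a rough way to estimate a required sample size under the same assumptions.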

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8226/7959610/c6fde6804704/peerj-cs-07-398-g001.jpg
