Challenges in the real world use of classification accuracy metrics: From recall and precision to the Matthews correlation coefficient.

Affiliation

School of Geography, University of Nottingham, Nottingham, Nottinghamshire, United Kingdom.

Publication information

PLoS One. 2023 Oct 4;18(10):e0291908. doi: 10.1371/journal.pone.0291908. eCollection 2023.

Abstract

The accuracy of a classification is fundamental to its interpretation, use and ultimately decision making. Unfortunately, the apparent accuracy assessed can differ greatly from the true accuracy. Mis-estimation of classification accuracy metrics and associated mis-interpretations are often due to variations in prevalence and the use of an imperfect reference standard. The fundamental issues underlying the problems associated with variations in prevalence and reference standard quality are revisited here for binary classifications, with particular attention focused on the use of the Matthews correlation coefficient (MCC). A key attribute claimed of the MCC is that a high value can only be attained when the classification performed well on both classes in a binary classification. However, it is shown here that the apparent magnitude of a set of popular accuracy metrics used in fields such as computer science, medicine and environmental science (Recall, Precision, Specificity, Negative Predictive Value, J, F1, likelihood ratios and MCC) and one key attribute (prevalence) were all influenced greatly by variations in prevalence and use of an imperfect reference standard. Simulations using realistic values for data quality in applications such as remote sensing showed that each metric varied over the range of possible prevalence and at differing levels of reference standard quality. The direction and magnitude of accuracy metric mis-estimation were a function of prevalence and the size and nature of the imperfections in the reference standard. It was evident that the apparent MCC could be substantially under- or over-estimated. Additionally, a high apparent MCC arose from an unquestionably poor classification. As with some other metrics of accuracy, the utility of the MCC may be overstated and apparent values need to be interpreted with caution. Apparent accuracy and prevalence values can be misleading, and calls for the issues to be recognised and addressed should be heeded.
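The dependence on prevalence and reference quality described in the abstract can be illustrated with a small Python sketch. This is not the paper's simulation. The function names (`confusion`, `apparent_confusion`, `metrics`) are hypothetical, and the sensitivity and specificity values of 0.9 are illustrative only. The sketch also assumes that classifier and reference errors are independent given the true class, a simplification the abstract does not state.

```python
from math import sqrt

def confusion(prev, sens, spec):
    """Expected (tp, fn, fp, tn) proportions against a perfect reference."""
    tp = prev * sens
    fn = prev * (1 - sens)
    fp = (1 - prev) * (1 - spec)
    tn = (1 - prev) * spec
    return tp, fn, fp, tn

def apparent_confusion(prev, sens, spec, ref_sens, ref_spec):
    """Expected (tp, fn, fp, tn) as measured against an imperfect reference.

    Assumption (not from the source): classifier and reference errors are
    independent conditional on the true class.
    """
    pos, neg = prev, 1 - prev
    tp = pos * sens * ref_sens + neg * (1 - spec) * (1 - ref_spec)
    fn = pos * (1 - sens) * ref_sens + neg * spec * (1 - ref_spec)
    fp = pos * sens * (1 - ref_sens) + neg * (1 - spec) * ref_spec
    tn = pos * (1 - sens) * (1 - ref_sens) + neg * spec * ref_spec
    return tp, fn, fp, tn

def metrics(tp, fn, fp, tn):
    """Standard binary accuracy metrics from confusion-matrix entries."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "prevalence": tp + fn,  # prevalence as seen through the reference
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "npv": npv,
        "J": recall + specificity - 1,  # Youden's J
        "F1": 2 * precision * recall / (precision + recall),
        "MCC": mcc,
    }

if __name__ == "__main__":
    # Same classifier (sens = spec = 0.9) at different prevalences,
    # scored against a perfect and an imperfect (0.9 / 0.9) reference.
    for prev in (0.05, 0.25, 0.5, 0.75, 0.95):
        true_m = metrics(*confusion(prev, 0.9, 0.9))
        app_m = metrics(*apparent_confusion(prev, 0.9, 0.9, 0.9, 0.9))
        print(f"prev={prev:.2f}  true MCC={true_m['MCC']:.3f}  "
              f"apparent MCC={app_m['MCC']:.3f}  "
              f"apparent prevalence={app_m['prevalence']:.3f}")
```

Under these assumptions, recall and specificity stay fixed across prevalence when the reference is perfect, while precision, NPV, F1 and MCC shift. Once the reference is imperfect, every metric, and the apparent prevalence itself, can move away from its true value.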

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c7/10550141/b7ebd82b2fcf/pone.0291908.g001.jpg
