Department of Industrial and Systems Engineering, Virginia Tech, Virginia, 24061, USA.
Department of Computer Science, Virginia Tech, Virginia, 24061, USA.
F1000Res. 2022 Apr 4;11:391. doi: 10.12688/f1000research.110567.2. eCollection 2022.
Conventional binary classification performance metrics evaluate either general measures (accuracy, F score) or specific aspects (precision, recall) of a model's classifying ability. As such, these metrics, derived from the model's confusion matrix, provide crucial insight into classifier-data interactions. However, modern-day computational capabilities have allowed for the creation of increasingly complex models that share nearly identical classification performance. While traditional performance metrics remain essential indicators of a classifier's individual capabilities, their ability to differentiate between models is limited. In this paper, we present the methodology for MARS (Method for Assessing Relative Sensitivity/Specificity) ShineThrough and MARS Occlusion scores, two novel binary classification performance metrics designed to quantify the distinctiveness of a classifier's predictive successes and failures relative to alternative classifiers. Being able to quantitatively express classifier uniqueness adds a novel classifier-classifier layer to the process of model evaluation and could improve ensemble model-selection decision making. By calculating both conventional performance measures and the proposed MARS metrics for a simple classifier prediction dataset, we demonstrate that the proposed metrics deliver insight complementary to that of conventional metrics.
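The limitation described above can be illustrated with a minimal Python sketch. It computes the conventional confusion-matrix metrics for two hypothetical classifiers whose scores are identical even though they succeed on different instances. The labels, predictions, and the helper `uniquely_correct` are assumptions made for illustration only; in particular, `uniquely_correct` is a simple count of per-instance disagreement and is not the MARS ShineThrough or Occlusion formula, which this abstract does not state.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def conventional_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, all derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def uniquely_correct(y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]) -> List[int]:
    """Indices that classifier A gets right and classifier B gets wrong.

    Illustrative helper only; NOT the MARS metric definition.
    """
    return [i for i, (t, a, b) in enumerate(zip(y_true, pred_a, pred_b)) if a == t and b != t]


# Hypothetical toy data: both classifiers make two errors, but on different instances.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
clf_a = [1, 1, 1, 0, 0, 0, 0, 1]
clf_b = [0, 1, 1, 1, 1, 0, 0, 0]

metrics_a = conventional_metrics(y_true, clf_a)
metrics_b = conventional_metrics(y_true, clf_b)
only_a = uniquely_correct(y_true, clf_a, clf_b)
only_b = uniquely_correct(y_true, clf_b, clf_a)

print(metrics_a)
print(metrics_b)
print("correct only for A:", only_a)
print("correct only for B:", only_b)
```

In this toy case the conventional metrics cannot separate the two classifiers, while the per-instance comparison shows that each one succeeds where the other fails. That kind of classifier-classifier information is what a relative metric is meant to capture.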