Roza: a new and comprehensive metric for evaluating classification systems.

Author information

Melek Mesut, Melek Negin

Affiliations

Department of Electronics and Automation, Gumushane University, Gumushane, Turkey.

Faculty of Engineering, Department of Electrical and Electronics Engineering, Avrasya University, Trabzon, Turkey.

Publication information

Comput Methods Biomech Biomed Engin. 2022 Jul;25(9):1015-1027. doi: 10.1080/10255842.2021.1995721. Epub 2021 Oct 25.

DOI:10.1080/10255842.2021.1995721

PMID:34693834

Abstract

Many metrics, such as accuracy (ACC), area under the curve (AUC), the Jaccard index (JI), and Cohen's kappa coefficient, are available to measure the success of pattern recognition and machine/deep learning systems. However, the superiority of one system over another cannot be determined from these metrics alone, because a system can be successful by one metric but not by the others. Moreover, these metrics are insufficient when the number of samples per class is unequal (imbalanced data), so they do not support a sensible comparison between two given systems. In the present study, the comprehensive, fair, and accurate Roza metric is introduced for evaluating classification systems. (Roza means rose in Persian; when different permutations of the metrics used are superimposed in a polygon format, the result looks like a flower, hence the name.) This metric, which facilitates the comparison of systems, summarizes many metrics in a single value. To verify the stability and validity of the metric and to conduct a comprehensive, fair, and accurate comparison between systems, the Roza metric is calculated for systems tested under the same conditions, and the results are compared. For this purpose, systems tested with three different strategies on three different datasets are considered. The results show that the performance of a system can be summarized by a single value and that the Roza metric can be used as a powerful metric in all systems that include classification processes.
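The disagreement between metrics on imbalanced data can be illustrated with a minimal sketch. It does not compute the Roza metric, whose formula is not given in the abstract. It only evaluates three of the standard metrics named above (ACC, JI, and Cohen's kappa) from binary confusion-matrix counts. The function name `binary_metrics` and the counts for the two hypothetical systems are assumptions made for illustration, not data from the paper, and AUC is omitted because it requires prediction scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, Jaccard index (positive class), and Cohen's kappa
    for a binary confusion matrix given as raw counts."""
    n = tp + fp + fn + tn
    acc = (tp + tn) / n

    # Jaccard index of the positive class: TP / (TP + FP + FN)
    denom = tp + fp + fn
    jaccard = tp / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_yes + p_no
    kappa = (acc - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {"acc": acc, "jaccard": jaccard, "kappa": kappa}


# Hypothetical imbalanced test set: 10 positives, 990 negatives.
# System A labels every sample negative.
a = binary_metrics(tp=0, fp=0, fn=10, tn=990)
# System B finds 8 of the 10 positives at the cost of 30 false alarms.
b = binary_metrics(tp=8, fp=30, fn=2, tn=960)

for name, m in (("A", a), ("B", b)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this made-up case, system A scores higher on accuracy while system B scores higher on the Jaccard index and Cohen's kappa, so the ranking of the two systems depends on which metric is consulted. This is the kind of conflict that motivates a single summarizing value.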