Classification performance assessment for imbalanced multiclass data.

Affiliations

School of Engineering, Pablo de Olavide University, 41013, Seville, Spain.

Department of Computer Networks and Systems, Silesian University of Technology, ul. Akademicka 16, 44-100, Gliwice, Poland.

Publication information

Sci Rep. 2024 May 10;14(1):10759. doi: 10.1038/s41598-024-61365-z.

Abstract

The evaluation of diagnostic systems is pivotal for ensuring the deployment of high-quality solutions, especially given the pronounced context-sensitivity of certain systems, particularly in fields such as biomedicine. Of notable importance are predictive models where the target variable can encompass multiple values (multiclass), especially when these classes exhibit substantial frequency disparities (imbalance). In this study, we introduce the Imbalanced Multiclass Classification Performance (IMCP) curve, specifically designed for multiclass datasets (unlike the ROC curve), and characterized by its resilience to class distribution variations (in contrast to accuracy or F-score). Moreover, the IMCP curve facilitates individual performance assessment for each class within the diagnostic system, shedding light on the confidence associated with each prediction, an aspect of particular significance in medical diagnosis. Empirical experiments conducted with real-world data in a multiclass context (involving 35 types of tumors) featuring a high level of imbalance demonstrate that both the IMCP curve and the area under the IMCP curve serve as excellent indicators of classification quality.
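
The abstract's remark that accuracy is sensitive to class distribution can be illustrated with a small sketch. This is not the IMCP curve, whose construction the abstract does not give. It only shows, on made-up labels, how overall accuracy can look strong while per-class recall exposes that the minority classes are never detected. The data, the class names `A`, `B`, `C`, and the helper functions are assumptions chosen for illustration.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall computed separately for each class present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Hypothetical imbalanced three-class data: 90 / 8 / 2 samples.
y_true = ["A"] * 90 + ["B"] * 8 + ["C"] * 2
# A trivial classifier that always predicts the majority class.
y_pred = ["A"] * 100

acc = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
macro = macro_recall(y_true, y_pred)

print(f"accuracy     = {acc:.3f}")
print(f"recall/class = {recalls}")
print(f"macro recall = {macro:.3f}")
```

Here the always-majority classifier scores 0.9 accuracy even though it never identifies classes `B` or `C`. The macro-averaged recall of one third reflects that, because each class contributes equally regardless of its frequency.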

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb0c/11087593/dd1610d1582f/41598_2024_61365_Fig1_HTML.jpg