A Deep-Ensemble-Level-Based Interpretable Takagi-Sugeno-Kang Fuzzy Classifier for Imbalanced Data.

Publication information

IEEE Trans Cybern. 2022 May;52(5):3805-3818. doi: 10.1109/TCYB.2020.3016972. Epub 2022 May 19.

Abstract

Existing research shows that the misclassification rate on imbalanced data depends heavily on problematic areas, which arise from small disjuncts, class overlap, and borderline and rare data samples. This study presents a novel deep-ensemble-level-based TSK fuzzy classifier for imbalanced data classification tasks (IDE-TSK-FC), which stacks zero-order Takagi-Sugeno-Kang (TSK) fuzzy subclassifiers on the minority class and its problematic areas in a deep ensemble, to achieve both promising classification performance and the high interpretability of zero-order TSK fuzzy classifiers. In line with the stacked generalization principle, the proposed classifier also lifts oversampling from the data level to the deep ensemble level, with a guarantee of enhanced generalization capability for class imbalance learning. In the structure of IDE-TSK-FC, the first interpretable zero-order TSK fuzzy subclassifier is built on the original training dataset. Successive zero-order TSK fuzzy subclassifiers are then stacked layer by layer, each on the newly identified problematic area of the original training dataset plus the corresponding interpretable predictions obtained by averaging over all previous layers. At each layer, IDE-TSK-FC simply uses the classical K-nearest neighbor algorithm to identify its problematic area, which consists of the minority samples and their surrounding K majority neighbors. After randomly neglecting certain input features and randomly selecting the five Gaussian membership functions for all the chosen input features and the augmented feature in the premise of each fuzzy rule, each subclassifier can be obtained quickly by using the least learning machine to determine the consequent part of each fuzzy rule. Experimental results on both public datasets and a real-world healthcare dataset demonstrate IDE-TSK-FC's superiority in class imbalanced learning.
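The problematic-area step described in the abstract (minority samples plus their K surrounding majority neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `problematic_area`, the Euclidean distance, the default `k`, and the convention that the minority class is labeled `1` are all assumptions made for illustration.

```python
import numpy as np

def problematic_area(X, y, minority_label=1, k=3):
    """Return sorted indices of all minority samples plus, for each
    minority sample, its k nearest majority-class neighbours.

    Hypothetical helper: Euclidean distance and the label convention
    are assumptions, not details taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    maj_idx = np.flatnonzero(y != minority_label)
    k = min(k, len(maj_idx))  # cannot pick more neighbours than exist

    # Pairwise distances, shape (n_minority, n_majority).
    diff = X[min_idx][:, None, :] - X[maj_idx][None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    # For each minority sample, positions of its k closest majority samples.
    nearest = np.argsort(dist, axis=1)[:, :k]
    neighbours = np.unique(maj_idx[nearest])

    # Union of minority samples and the selected majority neighbours.
    return np.union1d(min_idx, neighbours)

# Toy data: one minority point at the origin, four majority points.
X_demo = [[0, 0], [1, 0], [5, 5], [0.1, 0], [10, 10]]
y_demo = [1, 0, 0, 0, 0]
area_demo = problematic_area(X_demo, y_demo, minority_label=1, k=2)
```

In the toy data, the two majority points closest to the single minority point are selected alongside it. In a stacked setting such as the one the abstract describes, a subset like this would serve as the training set for the next subclassifier; how the averaged predictions of earlier layers are appended as an augmented feature is not shown here.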

