Hybrid Classifier Ensemble for Imbalanced Data.

Publication information

IEEE Trans Neural Netw Learn Syst. 2020 Apr;31(4):1387-1400. doi: 10.1109/TNNLS.2019.2920246. Epub 2019 Jun 28.

DOI:10.1109/TNNLS.2019.2920246

Abstract

The class imbalance problem has become a leading challenge. Although conventional imbalance learning methods have been proposed to tackle this problem, they have some limitations: 1) undersampling methods suffer from losing important information and 2) cost-sensitive methods are sensitive to outliers and noise. To address these issues, we propose a hybrid optimal ensemble classifier framework that combines density-based undersampling and cost-effective methods through exploring state-of-the-art solutions using a multi-objective optimization algorithm. Specifically, we first develop a density-based undersampling method to select informative samples from the original training data with probability-based data transformation, which makes it possible to obtain multiple subsets following a balanced distribution across classes. Second, we exploit the cost-sensitive classification method to address the incompleteness of information problem by modifying the weights of misclassified minority samples rather than the majority ones. Finally, we introduce a multi-objective optimization procedure and utilize connections between samples to self-modify the classification result using an ensemble classifier framework. Extensive comparative experiments conducted on real-world data sets demonstrate that our method outperforms the majority of imbalance and ensemble classification approaches.
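The first step described in the abstract, drawing several class-balanced subsets from the majority class with a density-dependent sampling probability, can be illustrated with a minimal sketch. This is not the authors' algorithm: the k-nearest-neighbour density score, the choice to favour denser regions, and the names `knn_density` and `balanced_subsets` are all assumptions made for illustration, and the abstract's probability-based data transformation, cost-sensitive weighting, and multi-objective optimization are not modelled here.

```python
import math
import random


def knn_density(points, k=3):
    """Hypothetical density score: inverse mean distance to the k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if not dists:  # a single point has no neighbours
            scores.append(1.0)
            continue
        nearest = dists[:k]
        scores.append(1.0 / (sum(nearest) / len(nearest) + 1e-12))
    return scores


def balanced_subsets(majority, minority, n_subsets=5, k=3, seed=0):
    """Build n_subsets balanced sets: every minority point plus an equally sized,
    density-weighted sample (without replacement) of majority points.
    Assumption: denser majority points are treated as more informative."""
    rng = random.Random(seed)
    weights = knn_density(majority, k)
    subsets = []
    for _ in range(n_subsets):
        pool = list(range(len(majority)))   # indices still available in this subset
        pool_weights = list(weights)
        chosen = []
        for _ in range(min(len(minority), len(majority))):
            pos = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
            chosen.append(majority[pool.pop(pos)])
            pool_weights.pop(pos)
        subsets.append((chosen, list(minority)))
    return subsets


# Toy usage: 40 majority points and 8 minority points in two dimensions.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
minority = [(data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)) for _ in range(8)]
subsets = balanced_subsets(majority, minority, n_subsets=3)
```

Each returned pair could then train one base classifier of an ensemble. Sampling without replacement inside a subset keeps its majority points distinct, while different subsets may still overlap.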

Similar articles

Hybrid Classifier Ensemble for Imbalanced Data.

IEEE Trans Neural Netw Learn Syst. 2020 Apr;31(4):1387-1400. doi: 10.1109/TNNLS.2019.2920246. Epub 2019 Jun 28.

Adaptive Fusion Based Method for Imbalanced Data Classification.

Front Neurorobot. 2022 Feb 28;16:827913. doi: 10.3389/fnbot.2022.827913. eCollection 2022.

Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets.

J Cheminform. 2020 Oct 27;12(1):66. doi: 10.1186/s13321-020-00468-x.

A hybrid cost-sensitive ensemble for imbalanced breast thermogram classification.

Artif Intell Med. 2015 Nov;65(3):219-27. doi: 10.1016/j.artmed.2015.07.005. Epub 2015 Jul 31.

Ensemble learning with active example selection for imbalanced biomedical data classification.

IEEE/ACM Trans Comput Biol Bioinform. 2011 Mar-Apr;8(2):316-25. doi: 10.1109/TCBB.2010.96.

Hashing-Based Undersampling Ensemble for Imbalanced Pattern Classification Problems.

IEEE Trans Cybern. 2022 Feb;52(2):1269-1279. doi: 10.1109/TCYB.2020.3000754. Epub 2022 Feb 16.

Adaptive Subspace Optimization Ensemble Method for High-Dimensional Imbalanced Data Classification.

IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2284-2297. doi: 10.1109/TNNLS.2021.3106306. Epub 2023 May 2.

Hybrid k -Nearest Neighbor Classifier.

IEEE Trans Cybern. 2016 Jun;46(6):1263-75. doi: 10.1109/TCYB.2015.2443857. Epub 2015 Jun 26.

Embedding Undersampling Rotation Forest for Imbalanced Problem.

Comput Intell Neurosci. 2018 Nov 1;2018:6798042. doi: 10.1155/2018/6798042. eCollection 2018.

A hybrid ensemble and evolutionary algorithm for imbalanced classification and its application on bioinformatics.

Comput Biol Chem. 2022 Jun;98:107646. doi: 10.1016/j.compbiolchem.2022.107646. Epub 2022 Feb 23.

Cited by

Learning from Imbalanced Data: Integration of Advanced Resampling Techniques and Machine Learning Models for Enhanced Cancer Diagnosis and Prognosis.

Cancers (Basel). 2024 Oct 8;16(19):3417. doi: 10.3390/cancers16193417.

Predicting non-chemotherapy drug-induced agranulocytosis toxicity through ensemble machine learning approaches.

Front Pharmacol. 2024 Aug 14;15:1431941. doi: 10.3389/fphar.2024.1431941. eCollection 2024.

Determination of the rat estrous cycle vased on EfficientNet.

Front Vet Sci. 2024 Jul 23;11:1434991. doi: 10.3389/fvets.2024.1434991. eCollection 2024.

Adaptive Fusion Based Method for Imbalanced Data Classification.

Front Neurorobot. 2022 Feb 28;16:827913. doi: 10.3389/fnbot.2022.827913. eCollection 2022.
