Evolutionary Mahalanobis Distance-Based Oversampling for Multi-Class Imbalanced Data Classification.

Authors

Yao Leehter, Lin Tung-Bin

Affiliation

Department of Electrical Engineering, National Taipei University of Technology, Taipei 10618, Taiwan.

Publication

Sensors (Basel). 2021 Oct 4;21(19):6616. doi: 10.3390/s21196616.

Abstract

Sensing data are often imbalanced across classes, and oversampling the minority class is an effective remedy. This paper proposes an oversampling method called evolutionary Mahalanobis distance oversampling (EMDO) for multi-class imbalanced data classification. EMDO uses a set of ellipsoids to approximate the decision regions of the minority class. Multi-objective particle swarm optimization (MOPSO) is integrated with the Gustafson-Kessel algorithm in EMDO to learn the size, center, and orientation of each ellipsoid. Synthetic minority samples are then generated within each ellipsoid based on Mahalanobis distance, and the number of samples generated in each ellipsoid is determined by the density of minority samples in that ellipsoid. Computer simulations indicate that EMDO outperforms most widely used oversampling schemes.
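The ellipsoid-based generation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each ellipsoid is already given by a center and a covariance-like matrix (in EMDO these would come from the MOPSO and Gustafson-Kessel learning stage, which is omitted here), and it draws points uniformly inside the region where the Mahalanobis distance to the center does not exceed a chosen radius. The function names `sample_in_ellipsoid`, `mahalanobis`, and `allocate_counts`, the `radius` parameter, and the uniform fill are all assumptions made for illustration. The abstract states only that the per-ellipsoid sample count depends on minority density, so `allocate_counts` takes caller-supplied weights rather than committing to a particular density rule.

```python
import numpy as np


def sample_in_ellipsoid(center, cov, n_samples, radius=1.0, rng=None):
    """Draw points x with Mahalanobis distance to `center` at most `radius`.

    Hypothetical helper: `cov` is a symmetric positive-definite matrix that
    sets the ellipsoid's size and orientation.
    """
    rng = np.random.default_rng(rng)
    center = np.asarray(center, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = center.shape[0]
    # Uniform directions on the unit sphere.
    u = rng.normal(size=(n_samples, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    # Radii r = U^(1/d) give a uniform fill of the unit ball.
    r = rng.random(n_samples) ** (1.0 / d)
    ball = u * r[:, None]
    # Map the unit ball onto the ellipsoid via the Cholesky factor (cov = L L^T).
    L = np.linalg.cholesky(cov)
    return center + radius * ball @ L.T


def mahalanobis(x, center, cov):
    """Mahalanobis distance of each row of `x` from `center` under `cov`."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def allocate_counts(weights, total):
    """Split `total` synthetic samples across ellipsoids in proportion to
    `weights` (assumed to be some density-derived score), using the
    largest-remainder rule so the counts sum exactly to `total`."""
    w = np.asarray(weights, dtype=float)
    raw = w / w.sum() * total
    counts = np.floor(raw).astype(int)
    remainder = int(total - counts.sum())
    order = np.argsort(-(raw - counts))
    counts[order[:remainder]] += 1
    return counts


if __name__ == "__main__":
    center = np.array([1.0, -2.0])
    cov = np.array([[2.0, 0.6], [0.6, 1.0]])
    pts = sample_in_ellipsoid(center, cov, n_samples=5, radius=1.5, rng=0)
    print(pts)
    print(mahalanobis(pts, center, cov))  # every value is at most 1.5
```

Mapping a uniform unit ball through the Cholesky factor keeps every generated point inside the ellipsoid by construction, so no rejection step is needed.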

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fdf2/8512012/c95bb1144ed3/sensors-21-06616-g001.jpg
