BlindSMOTE: Synthetic minority oversampling based only on evolutionary computation.

Authors

García-Pedrajas Nicolás E, Cuevas-Muñoz José M, de Haro-García Aida

Affiliation

Department of Computer Science, University of Córdoba, 14071 Córdoba, Spain

Publication information

Evol Comput. 2025 Apr 16:1-35. doi: 10.1162/evco_a_00374.

DOI: 10.1162/evco_a_00374
PMID: 40299773

Abstract

One of the most common problems in data mining applications is the uneven distribution of classes, which appears in many real-world scenarios. The class of interest is often highly underrepresented in the given dataset, which harms the performance of most classifiers. One of the most successful methods for addressing the class imbalance problem is to oversample the minority class using synthetic samples. Since the original algorithm, the synthetic minority oversampling technique (SMOTE), introduced this method, numerous versions have emerged, each of which is based on a specific hypothesis about where and how to generate new synthetic instances. In this paper, we propose a different approach based exclusively on evolutionary computation that imposes no constraints on the creation of new synthetic instances. Majority class undersampling is also incorporated into the evolutionary process. A thorough comparison involving three classification methods, 85 datasets, and more than 90 class-imbalance strategies shows the advantages of our proposal.
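
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the classic SMOTE interpolation step: a synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours. It is not the BlindSMOTE method, whose details the abstract does not give. The function name `smote`, its parameters, and the toy data are illustrative assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (classic SMOTE idea, not BlindSMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + u * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy minority class with two features.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, this scheme can only fill in regions between known minority instances. That is one example of the kind of built-in hypothesis about where to generate instances that the abstract says the proposed evolutionary approach avoids.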

Similar articles

1
BlindSMOTE: Synthetic minority oversampling based only on evolutionary computation.
Evol Comput. 2025 Apr 16:1-35. doi: 10.1162/evco_a_00374.
2
Interaction effect between data discretization and data resampling for class-imbalanced medical datasets.
Technol Health Care. 2025 Mar;33(2):1000-1013. doi: 10.1177/09287329241295874. Epub 2024 Nov 25.
3
A Synthetic Minority Oversampling Technique Based on Gaussian Mixture Model Filtering for Imbalanced Data Classification.
IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):3740-3753. doi: 10.1109/TNNLS.2022.3197156. Epub 2024 Feb 29.
4
Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets.
J Cheminform. 2020 Oct 27;12(1):66. doi: 10.1186/s13321-020-00468-x.
5
Outlier-SMOTE: A refined oversampling technique for improved detection of COVID-19.
Intell Based Med. 2020 Dec;3:100023. doi: 10.1016/j.ibmed.2020.100023. Epub 2020 Dec 3.
6
A novel method for detecting credit card fraud problems.
PLoS One. 2024 Mar 6;19(3):e0294537. doi: 10.1371/journal.pone.0294537. eCollection 2024.
7
DBCSMOTE: a clustering-based oversampling technique for data-imbalanced warfarin dose prediction.
BMC Med Genomics. 2020 Oct 22;13(Suppl 10):152. doi: 10.1186/s12920-020-00781-2.
8
SMOTE for high-dimensional class-imbalanced data.
BMC Bioinformatics. 2013 Mar 22;14:106. doi: 10.1186/1471-2105-14-106.
9
Addressing imbalance in health data: Synthetic minority oversampling using deep learning.
Comput Biol Med. 2025 Apr;188:109830. doi: 10.1016/j.compbiomed.2025.109830. Epub 2025 Feb 20.
10
Biased Random Forest For Dealing With the Class Imbalance Problem.
IEEE Trans Neural Netw Learn Syst. 2019 Jul;30(7):2163-2172. doi: 10.1109/TNNLS.2018.2878400. Epub 2018 Nov 20.