Neural Network With a Preference Sampling Paradigm for Imbalanced Data Classification.

Author information

Huang Zhan Ao, Sang Yongsheng, Sun Yanan, Lv Jiancheng

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9252-9266. doi: 10.1109/TNNLS.2022.3231917. Epub 2024 Jul 8.

DOI:10.1109/TNNLS.2022.3231917
PMID:37018700
Abstract

Most data in real life are characterized by imbalance problems. One of the classic models for dealing with imbalanced data is neural networks. However, the data imbalance problem often causes the neural network to display negative class preference behavior. Using an undersampling strategy to reconstruct a balanced dataset is one of the methods to alleviate the data imbalance problem. However, most existing undersampling methods focus more on the data or aim to preserve the overall structural characteristics of the negative class through potential energy estimation, while the problems of gradient inundation and insufficient empirical representation of positive samples have not been well considered. Therefore, a new paradigm for solving the data imbalance problem is proposed. Specifically, to solve the problem of gradient inundation, an informative undersampling strategy is derived from the performance degradation and used to restore the ability of neural networks to work under imbalanced data. In addition, to alleviate the problem of insufficient empirical representation of positive samples, a boundary expansion strategy with linear interpolation and the prediction consistency constraint is considered. We tested the proposed paradigm on 34 imbalanced datasets with imbalance ratios ranging from 16.90 to 100.14. The test results show that our paradigm obtained the best area under the receiver operating characteristic curve (AUC) on 26 datasets.
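The quantities named in the abstract (imbalance ratio, undersampling of the negative class, linear interpolation of positive samples, and AUC) can be illustrated with a minimal sketch. This is not the authors' method: plain random undersampling stands in for their informative strategy, the interpolation is a generic convex combination of positive pairs, and the prediction consistency constraint is not shown. All function names and the toy data are hypothetical.

```python
import random

def imbalance_ratio(labels):
    """Negative (majority) count divided by positive (minority) count."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    return n_neg / n_pos

def random_undersample(samples, labels, rng):
    """Keep all positives plus an equally sized random subset of negatives.

    Assumption: plain random undersampling, used only as a stand-in for
    the informative undersampling strategy described in the abstract.
    """
    pos = [x for x, y in zip(samples, labels) if y == 1]
    neg = [x for x, y in zip(samples, labels) if y == 0]
    kept_neg = rng.sample(neg, len(pos))
    return pos + kept_neg, [1] * len(pos) + [0] * len(kept_neg)

def interpolate_positives(pos, n_new, rng):
    """Create synthetic positives on segments between random positive pairs."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(pos, 2)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([lam * ai + (1 - lam) * bi for ai, bi in zip(a, b)])
    return synthetic

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
# Hypothetical toy data: 5 positives near (2, 2), 100 negatives near (0, 0).
positives = [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(5)]
negatives = [[rng.gauss(0, 1.0), rng.gauss(0, 1.0)] for _ in range(100)]
samples = positives + negatives
labels = [1] * 5 + [0] * 100

ratio = imbalance_ratio(labels)
balanced_x, balanced_y = random_undersample(samples, labels, rng)
new_pos = interpolate_positives(positives, 10, rng)
```

With this toy data the imbalance ratio is 100 / 5 = 20, which falls inside the 16.90 to 100.14 range reported for the benchmark datasets. Because each synthetic point is a convex combination of two positives, it always lies between them, so this kind of interpolation fills in the positive region rather than extrapolating beyond it.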

Similar articles

1. Neural Network With a Preference Sampling Paradigm for Imbalanced Data Classification.
IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9252-9266. doi: 10.1109/TNNLS.2022.3231917. Epub 2024 Jul 8.
2. Analysis of sampling techniques for imbalanced data: An n = 648 ADNI study.
Neuroimage. 2014 Feb 15;87:220-41. doi: 10.1016/j.neuroimage.2013.10.005. Epub 2013 Oct 29.
3. An empirical evaluation of sampling methods for the classification of imbalanced data.
PLoS One. 2022 Jul 28;17(7):e0271260. doi: 10.1371/journal.pone.0271260. eCollection 2022.
4. Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets.
J Cheminform. 2020 Oct 27;12(1):66. doi: 10.1186/s13321-020-00468-x.
5. A systematic study of the class imbalance problem in convolutional neural networks.
Neural Netw. 2018 Oct;106:249-259. doi: 10.1016/j.neunet.2018.07.011. Epub 2018 Jul 29.
6. Class imbalance should not throw you off balance: Choosing the right classifiers and performance metrics for brain decoding with imbalanced data.
Neuroimage. 2023 Aug 15;277:120253. doi: 10.1016/j.neuroimage.2023.120253. Epub 2023 Jun 28.
7. Batch-balanced focal loss: a hybrid solution to class imbalance in deep learning.
J Med Imaging (Bellingham). 2023 Sep;10(5):051809. doi: 10.1117/1.JMI.10.5.051809. Epub 2023 Jun 23.
8. Semi-supervised learning for medical image classification using imbalanced training data.
Comput Methods Programs Biomed. 2022 Apr;216:106628. doi: 10.1016/j.cmpb.2022.106628. Epub 2022 Jan 14.
9. Comparison of Resampling Techniques for Imbalanced Datasets in Machine Learning: Application to Epileptogenic Zone Localization From Interictal Intracranial EEG Recordings in Patients With Focal Epilepsy.
Front Neuroinform. 2021 Nov 19;15:715421. doi: 10.3389/fninf.2021.715421. eCollection 2021.
10. The receiver operating characteristic curve accurately assesses imbalanced datasets.
Patterns (N Y). 2024 May 31;5(6):100994. doi: 10.1016/j.patter.2024.100994. eCollection 2024 Jun 14.

Cited by

1. Deep learning analysis of exercise stress electrocardiography for identification of significant coronary artery disease.
Front Artif Intell. 2025 Mar 17;8:1496109. doi: 10.3389/frai.2025.1496109. eCollection 2025.