IEEE Trans Neural Netw Learn Syst. 2013 Jun;24(6):888-99. doi: 10.1109/TNNLS.2013.2246188.
Traditional learning algorithms applied to complex and highly imbalanced training sets may not give satisfactory results when distinguishing between examples of the classes. The tendency is to yield classification models that are biased toward the overrepresented (majority) class. This paper investigates this class imbalance problem in the context of multilayer perceptron (MLP) neural networks. The consequences of the equal-cost (loss) assumption on imbalanced data are formally discussed from a statistical learning theory point of view. A new cost-sensitive algorithm (CSMLP) is presented to improve the discrimination ability of (two-class) MLPs. The CSMLP formulation is based on a joint objective function that uses a single cost parameter to distinguish the importance of class errors. The learning rule extends the Levenberg-Marquardt rule, ensuring the computational efficiency of the algorithm. In addition, it is theoretically demonstrated that incorporating prior information via the cost parameter may lead to balanced decision boundaries in the feature space. Based on a statistical analysis of results on real data, our approach shows significant improvements over regular MLPs in the area under the receiver operating characteristic curve and in the G-mean measure.
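The abstract does not give the CSMLP objective itself, but the core idea of a single cost parameter weighting class errors can be sketched with a class-dependent squared-error loss. The function name `cost_sensitive_sse` and the weighting scheme below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def cost_sensitive_sse(y_true, y_pred, lam):
    """Joint squared-error objective with one cost parameter `lam`.

    Errors on the positive (minority) class are weighted by `lam`,
    errors on the negative (majority) class by 1. This is a minimal
    sketch of cost-sensitive weighting, not the paper's CSMLP rule.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.where(y_true == 1, lam, 1.0)  # per-example cost weights
    return 0.5 * np.sum(w * (y_true - y_pred) ** 2)

# With lam > 1, a miss on the minority class costs more than an
# equally sized miss on the majority class.
equal = cost_sensitive_sse([1, 0], [0.5, 0.5], lam=1.0)   # 0.25
biased = cost_sensitive_sse([1, 0], [0.5, 0.5], lam=4.0)  # 0.625
```

Setting `lam = 1` recovers the ordinary equal-cost sum of squared errors, which is exactly the assumption the paper argues biases the decision boundary toward the majority class.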
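The G-mean measure reported in the evaluation is a standard imbalanced-classification metric: the geometric mean of sensitivity (recall on the positive class) and specificity (recall on the negative class). A minimal sketch, assuming binary 0/1 labels:

```python
import numpy as np

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall: sqrt(sensitivity * specificity).

    Unlike accuracy, G-mean is low whenever either class is poorly
    recognized, so a classifier that ignores the minority class scores
    near zero even on a highly imbalanced test set.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    sensitivity = tp / max(np.sum(y_true == 1), 1)
    specificity = tn / max(np.sum(y_true == 0), 1)
    return np.sqrt(sensitivity * specificity)

# Missing half the positives caps G-mean at sqrt(0.5) ~ 0.707,
# even though plain accuracy here would be 0.75.
score = g_mean([1, 1, 0, 0], [1, 0, 0, 0])
```

This is why the paper pairs G-mean with the area under the ROC curve: both reward balanced performance across classes rather than raw accuracy on the majority class.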