
Adaptive Learning-Based k-Nearest Neighbor Classifiers With Resilience to Class Imbalance

Author Information

Mullick Sankha Subhra, Datta Shounak, Das Swagatam

Publication Information

IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5713-5725. doi: 10.1109/TNNLS.2018.2812279. Epub 2018 Mar 27.

Abstract

The classification accuracy of a k-nearest neighbor (k-NN) classifier is largely dependent on the choice of the number of nearest neighbors, denoted by k. However, given a data set, it is a tedious task to optimize the performance of k-NN by tuning k. Moreover, the performance of k-NN degrades in the presence of class imbalance, a situation characterized by disparate representation from different classes. We aim to address both of these issues in this paper and propose a variant of k-NN called the Adaptive k-NN (Ada-kNN). The Ada-kNN classifier uses the density and distribution of the neighborhood of a test point and learns a suitable point-specific k for it with the help of artificial neural networks. We further improve our proposal by replacing the neural network with a heuristic learning method guided by an indicator of the local density of a test point, using information about its neighboring training points. The proposed heuristic learning algorithm preserves the simplicity of k-NN without incurring a serious computational burden. We call this method Ada-kNN2. Ada-kNN and Ada-kNN2 perform very competitively when compared with k-NN, five of k-NN's state-of-the-art variants, and other popular classifiers. Furthermore, we propose a class-based global weighting scheme (Global Imbalance Handling Scheme, or GIHS) to compensate for the effect of class imbalance. We perform extensive experiments on a wide variety of data sets to establish the improvement shown by Ada-kNN and Ada-kNN2 using the proposed GIHS, when compared with k-NN and its 12 variants specifically tailored for imbalanced classification.
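The abstract does not spell out the exact form of GIHS or the adaptive-k heuristic, but the general idea of a class-based global weighting for k-NN voting can be sketched as follows. This is a minimal illustration under assumed choices (inverse-frequency class weights, Euclidean distance, a fixed k), not the authors' algorithm:

```python
from collections import Counter
import numpy as np

def weighted_knn_predict(X_train, y_train, x_test, k=3):
    """k-NN vote in which each class's votes are scaled inversely to its
    training frequency -- a simple global class-based weighting,
    illustrative of (but not identical to) the GIHS idea."""
    # Global class weights: rarer classes receive larger weights.
    counts = Counter(y_train)
    n, n_classes = len(y_train), len(counts)
    weights = {c: n / (n_classes * cnt) for c, cnt in counts.items()}

    # Labels of the k nearest training points by Euclidean distance.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    neighbors = y_train[np.argsort(dists)[:k]]

    # Weighted majority vote.
    scores = Counter()
    for label in neighbors:
        scores[label] += weights[label]
    return max(scores, key=scores.get)

# Toy imbalanced set: four class-0 points, one class-1 point.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.2, 0.1], [1.0, 1.0]])
y = np.array([0, 0, 0, 0, 1])
print(weighted_knn_predict(X, y, np.array([0.9, 0.9]), k=3))  # → 1
```

For the test point above, the three nearest neighbors vote 2-to-1 for class 0, so plain k-NN would predict 0; the inverse-frequency weights let the single minority-class vote win, which is the effect a class-based global weighting is meant to produce.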

