Random Space Division Sampling for Label-Noisy Classification or Imbalanced Classification.

Publication information

IEEE Trans Cybern. 2022 Oct;52(10):10444-10457. doi: 10.1109/TCYB.2021.3070005. Epub 2022 Sep 19.

Abstract

This article presents random space division sampling (RSDS), a simple, easy-to-implement sampling method for classification based on the idea of random space division. By efficiently distinguishing label-noise points, inner points, and boundary points, it extracts the boundary points as the sampled result. This makes it the first general sampling method for classification that can both reduce the data size and improve the classification accuracy of a classifier, especially in label-noisy classification. Here "general" means that it is not restricted to any specific classifier or dataset (regardless of whether the dataset is linear). Furthermore, because its time complexity is lower than that of most classifiers, RSDS can accelerate most classifiers online. RSDS can also be used as an undersampling method for imbalanced classification. Experimental results on benchmark datasets demonstrate its effectiveness and efficiency. The code for RSDS and the comparison algorithms is available at https://github.com/syxiaa/RSDS.
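The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea (divide the space at random, then separate label-noise, inner, and boundary points), not the authors' RSDS procedure; the repository linked above is the place to look for that. Everything here is an assumption made for the sketch: the randomly shifted grid used as the "space division", the two-pass voting rule, the function names `grid_votes` and `rsds_sketch`, and the thresholds `noise_ratio` and `boundary_ratio`.

```python
import random
from collections import Counter, defaultdict


def grid_votes(points, labels, active, n_rounds, cell_width, rng):
    """For each active point, count the rounds in which its random grid cell
    holds more than one label (mixed) and the rounds in which the point
    disagrees with the majority label of its cell (minority)."""
    mixed = {i: 0 for i in active}
    minority = {i: 0 for i in active}
    dim = len(points[0])
    for _ in range(n_rounds):
        # One random space division: an axis-aligned grid with a random shift.
        shift = [rng.uniform(0.0, cell_width) for _ in range(dim)]
        cells = defaultdict(list)
        for i in active:
            key = tuple(int((x + s) // cell_width)
                        for x, s in zip(points[i], shift))
            cells[key].append(i)
        for members in cells.values():
            counts = Counter(labels[i] for i in members)
            if len(counts) == 1:
                continue  # pure cell: no evidence of noise or of a boundary
            majority_label = counts.most_common(1)[0][0]
            for i in members:
                mixed[i] += 1
                if labels[i] != majority_label:
                    minority[i] += 1
    return mixed, minority


def rsds_sketch(points, labels, n_rounds=100, cell_width=1.0,
                noise_ratio=0.8, boundary_ratio=0.5, seed=0):
    """Return one role per point: 'noise', 'inner', or 'boundary'.
    Hypothetical heuristic, not the published algorithm."""
    rng = random.Random(seed)
    everyone = list(range(len(points)))
    # Pass 1: a point outvoted in most divisions is treated as label noise.
    _, minority = grid_votes(points, labels, everyone, n_rounds, cell_width, rng)
    noise = {i for i in everyone if minority[i] >= noise_ratio * n_rounds}
    # Pass 2: with noise removed, points that often share a cell with
    # another class are treated as boundary points; the rest are inner.
    kept = [i for i in everyone if i not in noise]
    mixed, _ = grid_votes(points, labels, kept, n_rounds, cell_width, rng)
    roles = []
    for i in everyone:
        if i in noise:
            roles.append("noise")
        elif mixed[i] >= boundary_ratio * n_rounds:
            roles.append("boundary")
        else:
            roles.append("inner")
    return roles


# Toy 1-D data: class 0 on [0, 3), class 1 on [3, 6), one flipped label at x = 1.0.
points = [(i / 10,) for i in range(60)]
labels = [0 if i < 30 else 1 for i in range(60)]
labels[10] = 1  # injected label noise
roles = rsds_sketch(points, labels)
sample = [points[i] for i, r in enumerate(roles) if r == "boundary"]
```

In this toy setup the sampled set `sample` keeps only the points near the class interface at x = 3, which is the sense in which such a scheme can shrink the data while discarding a mislabeled point. The cell width and both thresholds would need tuning on real data.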

Similar articles

Granular Ball Sampling for Noisy Label Classification or Imbalanced Classification.

IEEE Trans Neural Netw Learn Syst. 2023 Apr;34(4):2144-2155. doi: 10.1109/TNNLS.2021.3105984. Epub 2023 Apr 4.

Clustering-based undersampling with random over sampling examples and support vector machine for imbalanced classification of breast cancer diagnosis.

Comput Assist Surg (Abingdon). 2019 Oct;24(sup2):62-72. doi: 10.1080/24699322.2019.1649074. Epub 2019 Aug 12.

A Noise-Filtered Under-Sampling Scheme for Imbalanced Classification.

IEEE Trans Cybern. 2017 Dec;47(12):4263-4274. doi: 10.1109/TCYB.2016.2606104. Epub 2016 Oct 12.

Protein classification with imbalanced data.

Proteins. 2008 Mar;70(4):1125-32. doi: 10.1002/prot.21870.

mCRF and mRD: Two Classification Methods Based on a Novel Multiclass Label Noise Filtering Learning Framework.

IEEE Trans Neural Netw Learn Syst. 2022 Jul;33(7):2916-2930. doi: 10.1109/TNNLS.2020.3047046. Epub 2022 Jul 6.

An empirical evaluation of sampling methods for the classification of imbalanced data.

PLoS One. 2022 Jul 28;17(7):e0271260. doi: 10.1371/journal.pone.0271260. eCollection 2022.

Conversion of adverse data corpus to shrewd output using sampling metrics.

Vis Comput Ind Biomed Art. 2020 Aug 11;3(1):19. doi: 10.1186/s42492-020-00055-9.

Comparison of Resampling Techniques for Imbalanced Datasets in Machine Learning: Application to Epileptogenic Zone Localization From Interictal Intracranial EEG Recordings in Patients With Focal Epilepsy.

Front Neuroinform. 2021 Nov 19;15:715421. doi: 10.3389/fninf.2021.715421. eCollection 2021.

Hybrid Incremental Ensemble Learning for Noisy Real-World Data Classification.

IEEE Trans Cybern. 2019 Feb;49(2):403-416. doi: 10.1109/TCYB.2017.2774266. Epub 2017 Dec 4.
