A rough set based algorithm for updating the modes in categorical clustering.

Author information

Salem Semeh Ben, Naouali Sami, Chtourou Zied

Affiliations

Science and Technologies for Defense (STD) Laboratory, Military Academy of Fondouk Jedid, Nabeul, Tunisia.

Polytechnic School of Tunisia, Rue El Khawarizmi, Al Marsá, B.P. 743, 2078 Tunis, Tunisia.

Publication information

Int J Mach Learn Cybern. 2021;12(7):2069-2090. doi: 10.1007/s13042-021-01293-w. Epub 2021 Mar 27.

DOI:10.1007/s13042-021-01293-w

PMID:33815625

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7998089/

Abstract

The categorical clustering problem has attracted much attention, particularly in recent decades, since many real-world applications produce categorical data. The k-modes algorithm, proposed in 1998, and its many variants have been widely used in this context. However, they suffer from a significant limitation related to the update of the modes in each iteration: the mode in the last step of these algorithms is selected at random, even though many candidate modes can be identified. In this paper, a rough density mode selection method is proposed to identify the adequate modes among a list of candidates in each iteration of k-modes. The proposed method, called Density Rough k-Modes (DRk-M), was evaluated on real-world datasets extracted from the UCI Machine Learning Repository, the Global Terrorism Database (GTD), and a set of collected Tweets. DRk-M was also compared with many state-of-the-art clustering methods and showed high efficiency.
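The limitation described in the abstract can be made concrete with a small sketch. It assumes the standard k-modes update, in which the mode of a cluster takes the most frequent value of each attribute; when several values tie for the highest frequency, every combination of tied values is an equally valid mode. The function name `candidate_modes` and the toy data are hypothetical and used only for illustration. This is not the DRk-M selection rule, whose details the abstract does not give.

```python
from collections import Counter
from itertools import product


def candidate_modes(cluster):
    """Return every tuple that is a valid mode of `cluster`.

    `cluster` is a non-empty list of equal-length tuples of categorical
    values. A mode takes, for each attribute, a value with the highest
    frequency in that attribute; ties yield several candidate modes.
    """
    per_attribute = []
    for column in zip(*cluster):
        counts = Counter(column)
        top = max(counts.values())
        # Keep every value tied for the highest frequency (sorted for determinism).
        per_attribute.append(sorted(v for v, c in counts.items() if c == top))
    # Each combination of tied values is an equally valid mode.
    return list(product(*per_attribute))


# Toy cluster (hypothetical data): both attributes have a two-way tie.
cluster = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("blue", "large"),
]

for mode in candidate_modes(cluster):
    print(mode)
```

On this toy cluster the sketch lists four candidate modes. A plain k-modes update would keep one of them arbitrarily, which is the step the abstract says the proposed method replaces with a density-based choice.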

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab8c/7998089/cef2c28645af/13042_2021_1293_Fig1_HTML.jpg

Similar articles

A rough set based algorithm for updating the modes in categorical clustering.

Int J Mach Learn Cybern. 2021;12(7):2069-2090. doi: 10.1007/s13042-021-01293-w. Epub 2021 Mar 27.

Rough set based information theoretic approach for clustering uncertain categorical data.

PLoS One. 2022 May 13;17(5):e0265190. doi: 10.1371/journal.pone.0265190. eCollection 2022.

A Global-Relationship Dissimilarity Measure for the k-Modes Clustering Algorithm.

Comput Intell Neurosci. 2017;2017:3691316. doi: 10.1155/2017/3691316. Epub 2017 Mar 28.

An Algorithm for Clustering Categorical Data With Set-Valued Features.

IEEE Trans Neural Netw Learn Syst. 2018 Oct;29(10):4593-4606. doi: 10.1109/TNNLS.2017.2770167. Epub 2017 Nov 29.

An Empirical Analysis of Rough Set Categorical Clustering Techniques.

PLoS One. 2017 Jan 9;12(1):e0164803. doi: 10.1371/journal.pone.0164803. eCollection 2017.

R-Ensembler: A greedy rough set based ensemble attribute selection algorithm with kNN imputation for classification of medical data.

Comput Methods Programs Biomed. 2020 Feb;184:105122. doi: 10.1016/j.cmpb.2019.105122. Epub 2019 Oct 8.

Clustering Categorical Data Using Community Detection Techniques.

Comput Intell Neurosci. 2017;2017:8986360. doi: 10.1155/2017/8986360. Epub 2017 Dec 21.

Space Structure and Clustering of Categorical Data.

IEEE Trans Neural Netw Learn Syst. 2016 Oct;27(10):2047-59. doi: 10.1109/TNNLS.2015.2451151. Epub 2015 Oct 2.

The impact of cluster representatives on the convergence of the k-modes type clustering.

IEEE Trans Pattern Anal Mach Intell. 2013 Jun;35(6):1509-22. doi: 10.1109/TPAMI.2012.228.

A novel artificial bee colony based clustering algorithm for categorical data.

PLoS One. 2015 May 20;10(5):e0127125. doi: 10.1371/journal.pone.0127125. eCollection 2015.

References cited in this article

Hierarchical Clustering Multi-Task Learning for Joint Human Action Grouping and Recognition.

IEEE Trans Pattern Anal Mach Intell. 2017 Jan;39(1):102-114. doi: 10.1109/TPAMI.2016.2537337. Epub 2016 Mar 2.
