Ng Michael K, Li Mark Junjie, Huang Joshua Zhexue, He Zengyou
Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong.
IEEE Trans Pattern Anal Mach Intell. 2007 Mar;29(3):503-7. doi: 10.1109/TPAMI.2007.53.
This correspondence describes extensions to the k-modes algorithm for clustering categorical data. A heuristic approach developed in [4], [12] modifies the simple matching dissimilarity measure for categorical objects so that the k-modes paradigm yields clusters with strong intra-cluster similarity while still clustering large categorical data sets efficiently. The main aim of this paper is to rigorously derive the updating formula of the k-modes clustering algorithm under the new dissimilarity measure and to establish the convergence of the algorithm within an optimization framework.
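The abstract does not state the modified measure, so the following is only a minimal sketch of the ingredients it names. `simple_matching` is the usual mismatch count between an object and a cluster mode. `frequency_weighted` is an assumed illustration of a frequency-based modification, in which a match on an attribute is discounted by the share of cluster members holding that category; the exact formula should be taken from the paper itself. `update_mode` is the standard k-modes update (most frequent category per attribute). All function names are hypothetical.

```python
from collections import Counter


def simple_matching(mode, obj):
    """Number of attributes on which obj differs from the cluster mode."""
    return sum(1 for m, x in zip(mode, obj) if m != x)


def frequency_weighted(mode, obj, cluster):
    """Assumed frequency-based variant (not taken from the abstract).

    A mismatch on attribute j costs 1. A match costs
    1 - (fraction of cluster members whose attribute j equals the mode),
    so matching a dominant category is cheaper than matching a rare one.
    """
    n = len(cluster)
    if n == 0:
        # No frequency information: fall back to plain simple matching.
        return float(simple_matching(mode, obj))
    total = 0.0
    for j, (m, x) in enumerate(zip(mode, obj)):
        if m != x:
            total += 1.0
        else:
            share = sum(1 for member in cluster if member[j] == m) / n
            total += 1.0 - share
    return total


def update_mode(cluster):
    """Standard k-modes update: most frequent category of each attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*cluster))


# Toy cluster with two categorical attributes.
cluster = [("a", "x"), ("a", "y"), ("b", "x")]
mode = update_mode(cluster)
d_plain = simple_matching(mode, ("a", "y"))
d_weighted = frequency_weighted(mode, ("a", "y"), cluster)
```

In this toy cluster the mode is `("a", "x")`. Under simple matching, the object `("a", "y")` has distance 1, while the assumed weighted variant adds a residual cost for the matched first attribute because only two of the three members share the value `"a"`.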