IEEE Trans Cybern. 2019 Apr;49(4):1391-1402. doi: 10.1109/TCYB.2018.2802453. Epub 2018 Feb 26.
In this paper, we develop a comprehensive conceptual and algorithmic framework for clustering homogeneous information granules. Although several approaches to handling granular (viz. non-numeric) data exist, the origin of the granular data they consider is often unclear and, as a consequence, the clustering results lack a full-fledged interpretation. We offer a holistic view of clustering information granules and of evaluating the clustering results. We start by forming information granules with the principle of justifiable granularity (PJG). In this regard, we discuss a number of parameters used in constructing the information granules and quantify the quality of the granules produced in this manner. Subsequently, Fuzzy C-Means is applied to cluster the derived information granules, which are represented in a parametric manner and associated with weights resulting from the use of the PJG. The quality of the clustering results is evaluated with the reconstruction criterion, which quantifies the concept of information granulation and degranulation. A suite of experiments on synthetic and publicly available datasets is reported to quantify the performance of the proposed approach and highlight its key features.
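The pipeline outlined in the abstract (PJG granules, weighted Fuzzy C-Means, reconstruction criterion) can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption: interval granules built around the median, coverage and specificity defined in their simplest linear form, the granule weight taken as the product of the two bound qualities, and the function names `pjg_interval`, `weighted_fcm`, and `reconstruction_error` chosen only for illustration.

```python
import numpy as np

def pjg_interval(data):
    """Interval granule [a, b] around the median; each bound maximizes
    coverage * specificity (assumed simple form of the PJG)."""
    data = np.sort(np.asarray(data, dtype=float))
    med = np.median(data)

    def best_bound(candidates, extreme):
        best, best_v = med, 0.0
        for c in candidates:
            left, right = min(med, c), max(med, c)
            coverage = np.mean((data >= left) & (data <= right))
            specificity = 1.0 - abs(c - med) / abs(extreme - med)
            v = coverage * specificity
            if v > best_v:
                best, best_v = c, v
        return best, best_v

    a, va = best_bound(data[data < med], data[0])
    b, vb = best_bound(data[data > med], data[-1])
    return a, b, va * vb  # weight = product of bound qualities (assumption)

def weighted_fcm(X, w, c, m=2.0, iters=100, seed=0):
    """Fuzzy C-Means in which datum k contributes to the prototypes with weight w[k]."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                      # memberships of each datum sum to 1
    for _ in range(iters):
        Um = (U ** m) * w                   # weighted memberships, shape (c, N)
        V = Um @ X / Um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0)           # standard FCM membership update
    return U, V

def reconstruction_error(X, U, V, m=2.0):
    """Degranulate each datum from prototypes and memberships, then sum squared errors."""
    Um = U ** m
    X_hat = (Um.T @ V) / Um.sum(axis=0)[:, None]
    return float(np.sum((X - X_hat) ** 2))

# Usage on synthetic data: ten groups of numeric data, one granule per group.
rng = np.random.default_rng(42)
groups = [rng.normal(loc, 1.0, size=50) for loc in [0.0] * 5 + [10.0] * 5]
granules = [pjg_interval(g) for g in groups]
X = np.array([[a, b] for a, b, _ in granules])   # parametric representation (a, b)
w = np.array([q for _, _, q in granules])        # PJG-derived weights
U, V = weighted_fcm(X, w, c=2)
print("prototypes:\n", V)
print("reconstruction error:", reconstruction_error(X, U, V))
```

Each granule is reduced to its bounds `(a, b)`, so clustering runs on ordinary two-dimensional vectors. The weights enter only the prototype update, because the per-datum constraint on memberships leaves the membership update unchanged.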