Understanding and enhancement of internal clustering validation measures.

Affiliation

Department of Information Systems, New Jersey Institute of Technology, Newark, NJ 07102, USA.

Publication information

IEEE Trans Cybern. 2013 Jun;43(3):982-94. doi: 10.1109/TSMCB.2012.2220543. Epub 2012 Oct 26.

Abstract

Clustering validation has long been recognized as one of the vital issues essential to the success of clustering applications. In general, clustering validation can be categorized into two classes, external clustering validation and internal clustering validation. In this paper, we focus on internal clustering validation and present a study of 11 widely used internal clustering validation measures for crisp clustering. The results of this study indicate that these existing measures have certain limitations in different application scenarios. As an alternative choice, we propose a new internal clustering validation measure, named clustering validation index based on nearest neighbors (CVNN), which is based on the notion of nearest neighbors. This measure can dynamically select multiple objects as representatives for different clusters in different situations. Experimental results show that CVNN outperforms the existing measures on both synthetic data and real-world data in different application scenarios.
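The abstract does not give the CVNN formula, so the following is only a minimal sketch of the general nearest-neighbor idea it mentions, not the authors' measure. It assumes Euclidean distance and a hypothetical helper name, `knn_separation`: for every object, it checks what fraction of its k nearest neighbors lie in a different cluster, averages that fraction per cluster, and reports the worst cluster. Lower values suggest better-separated clusters.

```python
import math


def knn_separation(points, labels, k):
    """Illustrative nearest-neighbor separation score (hypothetical helper,
    not the CVNN definition from the paper).

    points: list of equal-length numeric tuples
    labels: list of cluster labels, one per point
    k:      number of nearest neighbors to inspect (0 < k < len(points))
    """
    n = len(points)
    if len(labels) != n:
        raise ValueError("points and labels must have the same length")
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")

    per_cluster = {}
    for i in range(n):
        # Rank every other object by Euclidean distance to object i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        # Count the k nearest neighbors that belong to another cluster.
        foreign = sum(1 for j in others[:k] if labels[j] != labels[i])
        per_cluster.setdefault(labels[i], []).append(foreign / k)

    # Average within each cluster, then report the worst (largest) cluster.
    return max(sum(v) / len(v) for v in per_cluster.values())


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(knn_separation(data, [0, 0, 0, 1, 1, 1], k=2))
```

Objects deep inside a cluster contribute nothing to this score, while objects near a boundary with another cluster raise it, which is one way a nearest-neighbor measure can let boundary objects act as representatives of a cluster.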

