Support Vector Data Descriptions and $k$-Means Clustering: One Class?

Authors

Gornitz Nico, Lima Luiz Alberto, Muller Klaus-Robert, Kloft Marius, Nakajima Shinichi

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Sep;29(9):3994-4006. doi: 10.1109/TNNLS.2017.2737941. Epub 2017 Sep 27.

Abstract

We present ClusterSVDD, a methodology that unifies support vector data descriptions (SVDDs) and $k$-means clustering into a single formulation. This allows both methods to benefit from one another, i.e., by adding flexibility using multiple spheres for SVDDs and increasing anomaly resistance and flexibility through kernels to $k$-means. In particular, our approach leads to a new interpretation of $k$-means as a regularized mode seeking algorithm. The unifying formulation further allows for deriving new algorithms by transferring knowledge from one-class learning settings to clustering settings and vice versa. As a showcase, we derive a clustering method for structured data based on a one-class learning scenario. Additionally, our formulation can be solved via a particularly simple optimization scheme. We evaluate our approach empirically to highlight some of the proposed benefits on artificially generated data, as well as on real-world problems, and provide a Python software package comprising various implementations of primal and dual SVDD as well as our proposed ClusterSVDD.
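
The abstract does not spell out the formulation or the optimization scheme, so the following is only a minimal sketch of one way the idea of "multiple SVDD spheres fitted by alternation" could look, not the authors' implementation or the API of their package. Everything in it is an assumption made for illustration: the function names (`sphere_radius2`, `fit_sphere`, `cluster_svdd_sketch`), the linear (non-kernel) primal objective $R^2 + \frac{1}{\nu n}\sum_i \max(0, \lVert x_i - c\rVert^2 - R^2)$ per sphere, the plain subgradient solver, and the assignment rule that sends each point to the sphere minimizing $\lVert x - c_j\rVert^2 - R_j^2$.

```python
import numpy as np

def sphere_radius2(d2, nu):
    """Squared radius minimizing the assumed objective for a fixed center:
    at most nu*n points end up strictly outside the sphere."""
    n = len(d2)
    m = int(np.floor(nu * n))
    if m >= n:          # nu = 1: the radius collapses to zero
        return 0.0
    return float(np.sort(d2)[::-1][m])

def fit_sphere(X, nu, n_iter=200, lr=0.1):
    """Hypothetical primal SVDD fit by subgradient descent on the center.
    The solver choice is an assumption of this sketch."""
    n = len(X)
    c = X.mean(axis=0)
    for t in range(n_iter):
        d2 = ((X - c) ** 2).sum(axis=1)
        outside = d2 > sphere_radius2(d2, nu)
        # Only points outside the sphere contribute to the subgradient.
        grad = 2.0 * (c - X[outside]).sum(axis=0) / (nu * n)
        c = c - lr / np.sqrt(t + 1.0) * grad
    d2 = ((X - c) ** 2).sum(axis=1)
    return c, sphere_radius2(d2, nu)

def cluster_svdd_sketch(X, k, nu=0.1, n_rounds=20, seed=0):
    """Alternate between assigning points to spheres and refitting each sphere."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    radii2 = np.zeros(k)
    labels = None
    for _ in range(n_rounds):
        dist2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Assumed assignment rule: smallest signed distance to a sphere surface.
        new_labels = np.argmin(dist2 - radii2, axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous sphere
                centers[j], radii2[j] = fit_sphere(members, nu)
    return labels, centers, radii2

if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
    labels, centers, radii2 = cluster_svdd_sketch(X, k=2, nu=1.0)
    print(labels, centers, radii2)
```

Under the objective assumed here, setting `nu=1.0` drives every radius to zero and each center to the mean of its members, so the loop behaves like ordinary $k$-means; smaller values of `nu` let each sphere leave a fraction of its points outside. Kernels and structured data, both mentioned in the abstract, are outside the scope of this sketch.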

