Dynamic cluster formation using level set methods.

Authors

Yip Andy M, Ding Chris, Chan Tony F

Affiliation

Department of Mathematics, National University of Singapore, 2, Science Drive 2, Singapore 117543, Singapore.

Publication

IEEE Trans Pattern Anal Mach Intell. 2006 Jun;28(6):877-89. doi: 10.1109/TPAMI.2006.117.

Abstract

Density-based clustering has two advantages: 1) it allows clusters of arbitrary shape and 2) it does not require the number of clusters as input. However, when clusters touch each other, both the cluster centers and the cluster boundaries (the peaks and valleys of the density distribution) become fuzzy and difficult to determine. We introduce the notion of a cluster intensity function (CIF), which captures the important characteristics of clusters. When clusters are well separated, CIFs are similar to density functions. But when clusters become close to each other, CIFs still clearly reveal cluster centers, cluster boundaries, and the degree of membership of each data point in the cluster to which it belongs. Clustering through bump hunting and valley seeking based on these functions is more robust than clustering based on density functions obtained by kernel density estimation, which are often oscillatory or oversmoothed. These problems of kernel density estimation are resolved using level set methods and related techniques. Comparisons with two existing density-based methods, valley seeking and DBSCAN, are presented and illustrate the advantages of our approach.
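The remark that kernel density estimates are often oscillatory or oversmoothed can be illustrated with a minimal sketch. It is not the authors' method and does not implement the CIF or any level set technique. It only builds a plain one-dimensional Gaussian kernel density estimate for two touching clusters and counts its peaks at two bandwidths. The sample sizes, cluster centers, bandwidth values, and the helper names `gaussian_kde` and `count_modes` are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def count_modes(values):
    """Count strict local maxima (peaks) in a sequence of density values."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


# Two touching 1-D clusters (assumed toy data, fixed seed for repeatability).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(100)]
samples += [rng.gauss(2.5, 1.0) for _ in range(100)]

# Evaluation grid, padded slightly beyond the data range.
lo, hi = min(samples) - 3.0, max(samples) + 3.0
n_grid = 3000
grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]

# A very small bandwidth gives a spiky estimate; a very large one blurs
# the two clusters together.
modes_narrow = count_modes(gaussian_kde(samples, grid, bandwidth=0.05))
modes_wide = count_modes(gaussian_kde(samples, grid, bandwidth=3.0))

print("peaks with bandwidth 0.05:", modes_narrow)
print("peaks with bandwidth 3.0:", modes_wide)
```

With the narrow bandwidth the estimate shows many spurious peaks, and with the wide bandwidth the valley between the two clusters disappears, so neither setting gives bump hunting or valley seeking a clean pair of peaks to work with.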
