
Sparse kernel k-means clustering.

Authors

Park Beomjin, Park Changyi, Hong Sungchul, Choi Hosik

Affiliations

Department of Information and Statistics, Gyeongsang National University, Jinju, South Korea.

Department of Statistics, University of Seoul, Seoul, South Korea.

Publication information

J Appl Stat. 2024 Jun 5;52(1):158-182. doi: 10.1080/02664763.2024.2362266. eCollection 2025.

Abstract

Clustering is an essential technique that groups similar data points to uncover the underlying structure and features of the data. Although traditional clustering methods such as k-means are widely used, they have difficulty identifying nonlinear clusters. Alternative techniques such as kernel k-means and spectral clustering have been developed to address this issue. A further challenge arises when irrelevant variables are present in the data; this can be mitigated by variable selection methods, which fall into filter, wrapper, and embedded approaches. In this study, focusing on kernel k-means clustering, we propose an embedded variable selection method that uses a tensor product space together with a general analysis of variance (ANOVA) kernel for nonlinear clustering. Comprehensive experiments on simulated and real data demonstrated that the proposed method achieves competitive performance compared with existing approaches. The proposed method may therefore serve as a reliable tool for accurate cluster identification and variable selection, helping to gain insights into complex datasets.
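
To make the ingredients named in the abstract concrete, the following is a minimal sketch of plain kernel k-means run on a weighted ANOVA-type product kernel. It is not the authors' algorithm: the abstract does not give the objective or the optimization procedure, so the per-variable weights `theta` are fixed by hand here instead of being learned with a sparsity penalty. The univariate Gaussian kernel, the bandwidth `gamma`, and the function names `anova_kernel` and `kernel_kmeans` are all assumptions made for illustration.

```python
import numpy as np


def anova_kernel(X, theta, gamma=1.0):
    """Weighted ANOVA-type kernel: prod_j (1 + theta_j * k_j(x_j, x'_j)).

    k_j is a univariate Gaussian kernel (an assumption of this sketch).
    Setting theta_j = 0 turns the j-th factor into the constant 1,
    so variable j no longer influences the kernel.
    """
    n, p = X.shape
    K = np.ones((n, n))
    for j in range(p):
        # Pairwise differences in variable j, shape (n, n) via broadcasting.
        d = X[:, j, None] - X[None, :, j]
        K *= 1.0 + theta[j] * np.exp(-gamma * d ** 2)
    return K


def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(n_clusters, size=n)  # random initial assignment
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            idx = labels == c
            if not idx.any():
                continue  # empty cluster: leave its distances at infinity
            # Squared feature-space distance to the cluster mean:
            # K_ii - 2 * mean_l K_il + mean_{l,l'} K_ll'  (l, l' in cluster c)
            dist[:, c] = (
                diag
                - 2.0 * K[:, idx].mean(axis=1)
                + K[np.ix_(idx, idx)].mean()
            )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels


# Toy usage: two informative variables plus one pure-noise variable.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
X[:30, :2] += 4.0                      # shift the first group in variables 0 and 1
theta = np.array([1.0, 1.0, 0.0])      # hand-set weights: variable 2 switched off
labels = kernel_kmeans(anova_kernel(X, theta), n_clusters=2)
```

With `theta[2] = 0`, the third column cannot affect the kernel matrix, which is the sense in which a zero weight "selects out" a variable. An embedded method would estimate such weights jointly with the cluster assignments rather than fixing them in advance.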
