Kernel Probabilistic K-Means Clustering.

Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China.

School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China.

Publication information

Sensors (Basel). 2021 Mar 8;21(5):1892. doi: 10.3390/s21051892.

Abstract

Kernel fuzzy c-means (KFCM) is a significantly improved version of fuzzy c-means (FCM) for processing linearly inseparable datasets. However, for the fuzzification parameter m = 1, the KFCM problem cannot be solved by Lagrangian optimization. To solve this problem, an equivalent model, called kernel probabilistic k-means (KPKM), is proposed here. The novel model relates KFCM to kernel k-means (KKM) in a unified mathematical framework. Moreover, the proposed KPKM can be solved by the active gradient projection (AGP) method, a nonlinear programming technique for problems with linear equality and linear inequality constraints. To accelerate the AGP method, a fast AGP (FAGP) algorithm was designed. FAGP uses a maximum-step strategy to estimate the step length and an iterative method to update the projection matrix. Experiments comparing KPKM with KFCM, KKM, FCM and k-means demonstrated the effectiveness of the proposed method. On synthetic datasets, KPKM was able to find nonlinearly separable structures. On ten real UCI datasets, KPKM had better clustering performance on at least six datasets. FAGP requires less running time than the original AGP, reducing it by 76-95% on real datasets.
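
The step-length idea in the abstract can be illustrated on a single row of cluster memberships. The Python sketch below is not the paper's FAGP algorithm. It assumes the usual membership constraints (entries non-negative and summing to 1), uses a simplified active-set projection in place of a projection matrix, and reads "maximum step" as the largest step that keeps every membership non-negative. The function names `project_direction`, `max_feasible_step` and `take_step` are hypothetical.

```python
def project_direction(w, grad, tol=1e-12):
    """Project -grad onto directions d with sum(d) == 0 that do not
    push any membership already at the bound w_j == 0 below zero.
    Simplified active-set loop (an assumption, not the paper's rule)."""
    d = [-g for g in grad]
    free = list(range(len(w)))
    while free:
        # Subtracting the mean keeps the row sum unchanged.
        mean = sum(d[j] for j in free) / len(free)
        cand = {j: d[j] - mean for j in free}
        # Coordinates on the bound must not move in the negative direction.
        blocked = [j for j in free if w[j] <= tol and cand[j] < 0]
        if not blocked:
            out = [0.0] * len(w)
            for j, v in cand.items():
                out[j] = v
            return out
        free = [j for j in free if j not in blocked]
    return [0.0] * len(w)


def max_feasible_step(w, d, tol=1e-12):
    """Largest alpha such that w + alpha * d stays non-negative."""
    steps = [-wj / dj for wj, dj in zip(w, d) if dj < -tol]
    return min(steps) if steps else float("inf")


def take_step(w, d, alpha):
    """Move the membership row along d by step length alpha."""
    return [wj + alpha * dj for wj, dj in zip(w, d)]


if __name__ == "__main__":
    w = [0.5, 0.5, 0.0]        # one point's memberships in three clusters
    grad = [1.0, 0.0, 2.0]     # made-up gradient, for illustration only
    d = project_direction(w, grad)
    alpha = max_feasible_step(w, d)
    print(d, alpha, take_step(w, d, alpha))
```

In this toy row, the third membership is already zero and the gradient would push it negative, so that coordinate is held fixed and the step moves mass from the first cluster to the second until the first membership reaches zero.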

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ff4/7962817/fe34310ee04a/sensors-21-01892-g001.jpg
