Li Miaomiao, Zhang Yi, Ma Chuan, Liu Suyuan, Liu Zhe, Yin Jianping, Liu Xinwang, Liao Qing
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15910-15919. doi: 10.1109/TNNLS.2023.3290219. Epub 2024 Oct 29.
Multiple kernel clustering (MKC) aims to learn, from several precomputed base kernels, an optimal kernel that better serves clustering. Most MKC algorithms share the assumption that the optimal kernel is a linear combination of the base kernels. Built on a min-max framework, the recently proposed simple multiple kernel k-means (SimpleMKKM) method learns a high-quality unified kernel. Although SimpleMKKM achieves promising clustering performance, we observe that it cannot exploit any prior knowledge. As a result, the learned partition matrix may deviate seriously from the expected one, especially in clustering tasks where no ground truth is available during learning. To tackle this issue, we propose a novel algorithm termed regularized simple multiple kernel k-means with kernel average alignment (R-SMKKM-KAA). Experimental results of existing MKC algorithms indicate that the average partition is a strong baseline that reflects the true clustering. To draw on this knowledge, we add the average alignment as a regularization term that keeps the learned unified partition from straying far from the average partition. We then design an efficient algorithm to solve the resulting optimization problem. In this way, both the incorporated prior knowledge and the combination of base kernels help to learn a better unified partition, and the clustering performance improves significantly. Extensive experiments on nine common datasets demonstrate the effectiveness of incorporating prior knowledge into SimpleMKKM.
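The abstract does not state the objective function, so the following is only a minimal sketch of how an average-alignment regularizer could enter the partition step for a fixed set of kernel weights. Everything in it is an assumption made for illustration: the squared-weight combination K_gamma = sum_p gamma_p^2 K_p, the average partition H_avg taken as the top-k eigenvectors of the uniformly averaged kernel, the regularized objective Tr(H^T K_gamma H) + lam * Tr(H^T H_avg H_avg^T H), and all function and parameter names (`combine`, `regularized_partition`, `alignment`, `lam`). The outer minimization over the kernel weights in the min-max framework is not shown.

```python
import numpy as np


def top_k_eigvecs(M, k):
    """Return the k eigenvectors of a symmetric matrix with the largest eigenvalues."""
    M = (M + M.T) / 2                # symmetrize against round-off
    _, vecs = np.linalg.eigh(M)      # eigh returns eigenvalues in ascending order
    return vecs[:, -k:]


def combine(kernels, gamma):
    """Assumed combination rule: K_gamma = sum_p gamma_p**2 * K_p."""
    return np.tensordot(gamma ** 2, kernels, axes=1)


def regularized_partition(kernels, gamma, k, lam):
    """Partition step for fixed weights gamma (hypothetical formulation).

    Maximizes Tr(H^T K_gamma H) + lam * Tr(H^T H_avg H_avg^T H) subject to
    H^T H = I, which is solved by the top-k eigenvectors of
    K_gamma + lam * H_avg H_avg^T.
    """
    # Assumed definition of the average partition: relaxed kernel k-means
    # on the uniformly averaged kernel.
    H_avg = top_k_eigvecs(kernels.mean(axis=0), k)
    K_gamma = combine(kernels, gamma)
    H = top_k_eigvecs(K_gamma + lam * H_avg @ H_avg.T, k)
    return H, H_avg


def alignment(H, H_avg):
    """Alignment between two relaxed partitions; lies in [0, k]."""
    return np.trace(H.T @ H_avg @ H_avg.T @ H)
```

Under these assumptions, `kernels` is an array of shape (m, n, n) holding m precomputed kernel matrices and `gamma` is a length-m weight vector on the simplex. Increasing `lam` can only keep or raise `alignment(H, H_avg)`, which is the sense in which the regularizer keeps the learned partition from drifting away from the average partition.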