Wang Siwei, Liu Xinwang, Liu Li, Zhou Sihang, Zhu En
IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):4359-4370. doi: 10.1109/TNNLS.2021.3117403. Epub 2023 Aug 4.
Multiple kernel clustering (MKC) optimally combines a group of pre-specified base kernels to improve clustering performance. Among existing MKC algorithms, the recently proposed late fusion MKC methods show promising clustering performance in various applications while substantially reducing computational cost. However, we observe that existing late fusion methods learn the kernel base partitions and fuse them in two separate stages, which can lead to suboptimal solutions and degrade clustering performance. In this article, we propose a novel late fusion multiple kernel clustering with proxy graph refinement (LFMKC-PGR) framework to address this issue. First, we theoretically revisit the connection between the kernel base partitions used in late fusion and traditional spectral embedding. Based on this connection, we construct a proxy self-expressive graph from the kernel base partitions. The proxy graph in turn refines the individual kernel partitions and captures the relations among partitions through graph structure rather than a simple linear transformation. We also discuss the theoretical connections between the proposed framework and multiple kernel subspace clustering. An alternating optimization algorithm with proven convergence is then developed to solve the resulting optimization problem. Extensive experiments on 12 multi-kernel benchmark datasets demonstrate the effectiveness of the proposed algorithm. The code of the proposed algorithm is publicly available at https://github.com/wangsiwei2010/graphlatefusion_MKC.
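The two ingredients named in the abstract, kernel base partitions and a self-expressive graph built from them, can be illustrated with a minimal NumPy sketch. This is not the LFMKC-PGR algorithm: it assumes that each base partition is the top-k eigenvector matrix of its kernel (the usual spectral relaxation of kernel k-means) and substitutes a ridge-regularized self-expressive model with a closed-form solution for the paper's objective. It also runs once and does not feed the graph back to refine the partitions. The function names `base_partition`, `proxy_graph`, and `rbf_kernel`, and the parameter `lam`, are hypothetical and do not come from the released code.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix exp(-gamma * ||x_i - x_j||^2) for rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def base_partition(K, k):
    """Assumed base partition: the k leading eigenvectors of kernel K.

    This is the spectral relaxation of kernel k-means, so the returned
    n x k matrix H satisfies H^T H = I.
    """
    K = (K + K.T) / 2.0            # guard against small asymmetries
    _, vecs = np.linalg.eigh(K)    # eigenvalues come back in ascending order
    return vecs[:, -k:]


def proxy_graph(partitions, lam=0.1):
    """Illustrative self-expressive graph over the stacked base partitions.

    Treating each sample's concatenated partition rows as its feature vector,
    solve min_S ||X - X S||_F^2 + lam * ||S||_F^2 with X = [H_1, ..., H_m]^T.
    The closed form is S = (G + lam I)^{-1} G, where G = sum_p H_p H_p^T.
    This ridge model is a stand-in assumption, not the paper's objective.
    """
    G = sum(H @ H.T for H in partitions)
    n = G.shape[0]
    S = np.linalg.solve(G + lam * np.eye(n), G)
    return (np.abs(S) + np.abs(S.T)) / 2.0   # symmetric, non-negative affinity
```

A possible usage: build several kernels on the same samples (for example `rbf_kernel(X, gamma)` for a few values of `gamma`), compute one partition per kernel with `base_partition`, and pass the list to `proxy_graph`. The resulting affinity matrix could then be handed to any spectral clustering routine.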