Liu Xinwang, Li Miaomiao, Tang Chang, Xia Jingyuan, Xiong Jian, Liu Li, Kloft Marius, Zhu En
IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2634-2646. doi: 10.1109/TPAMI.2020.2974828. Epub 2021 Jul 1.
Incomplete multi-view clustering (IMVC) optimally combines multiple pre-specified incomplete views to improve clustering performance. Among the existing solutions, the recently proposed multiple kernel k-means with incomplete kernels (MKKM-IK) serves as a benchmark; it recasts IMVC as a joint optimization problem in which clustering and kernel matrix imputation are performed alternately until convergence. Although MKKM-IK demonstrates promising performance in various applications, we observe that its manner of kernel matrix imputation incurs intensive computational and storage complexity, an over-complicated optimization, and only limited improvement in clustering performance. In this paper, we first propose an Efficient and Effective Incomplete Multi-view Clustering (EE-IMVC) algorithm to address these issues. Instead of completing the incomplete kernel matrices, EE-IMVC imputes each incomplete base matrix generated by the incomplete views with a learned consensus clustering matrix. We then further improve this algorithm by incorporating prior knowledge to regularize the learned consensus clustering matrix. Two three-step iterative algorithms are carefully developed to solve the resultant optimization problems with linear computational complexity, and their convergence is theoretically proven. We also theoretically study the generalization bound of the proposed algorithms. Furthermore, we conduct comprehensive experiments to study the proposed algorithms in terms of clustering accuracy, the evolution of the learned consensus clustering matrix, and convergence. The results show that our algorithms significantly and consistently outperform several state-of-the-art ones.
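The abstract describes imputing each incomplete base matrix from a learned consensus clustering matrix through a three-step iteration, but it does not state the objective. The following is therefore only a minimal sketch under stated assumptions, not the authors' formulation. It assumes a trace-alignment objective, the sum over views of Tr(H^T H_p W_p), with orthonormal-column constraints on the consensus matrix H, on a per-view rotation W_p, and on the imputed block of each H_p. The names `polar_factor` and `impute_and_cluster_sketch`, the zero initialization of missing rows, and the omission of view weights and of the prior-knowledge regularizer are all illustrative choices.

```python
import numpy as np


def polar_factor(M):
    """Return X with orthonormal columns (or an orthogonal X if M is square)
    that maximizes Tr(X^T M), computed from the thin SVD of M."""
    P, _, Qt = np.linalg.svd(M, full_matrices=False)
    return P @ Qt


def impute_and_cluster_sketch(base, observed, k, n_iter=20):
    """Alternate three exact block updates on an ASSUMED objective
    sum_p Tr(H^T H_p W_p).

    base[p]     : (n, k) array; only rows where observed[p] is True are used.
    observed[p] : boolean mask of length n, True where view p has the sample.
    """
    # Hypothetical initialization: missing rows start at zero.
    filled = []
    for Hp, mask in zip(base, observed):
        Hp = np.array(Hp, dtype=float)  # copy so the caller's data is untouched
        Hp[~mask] = 0.0
        filled.append(Hp)
    W = [np.eye(k) for _ in base]
    history = []

    for _ in range(n_iter):
        # Step 1: update the consensus matrix H with the other blocks fixed.
        H = polar_factor(sum(Hp @ Wp for Hp, Wp in zip(filled, W)))

        # Step 2: update each per-view rotation W_p.
        W = [polar_factor(Hp.T @ H) for Hp in filled]

        # Step 3: impute the missing rows of each base matrix from H.
        for Hp, Wp, mask in zip(filled, W, observed):
            missing = ~mask
            # The orthonormal-column constraint needs at least k missing rows;
            # otherwise this sketch leaves the rows at their initial value.
            if missing.sum() >= k:
                Hp[missing] = polar_factor(H[missing] @ Wp.T)

        history.append(
            sum(float(np.trace(H.T @ Hp @ Wp)) for Hp, Wp in zip(filled, W))
        )

    return H, filled, history
```

Because each step maximizes the assumed objective over one block with the others held fixed, the recorded objective values should not decrease from one iteration to the next. In this sketch every update costs time linear in the number of samples n for fixed k and a fixed number of views, and no n-by-n kernel matrix is stored. A final partition would typically be obtained by running k-means on the rows of the returned H; that step is left out here.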