School of Computer Science and Engineering, University of Electronic Science and Technology of China, Sichuan, 611731, China.
College of Computer Science and Technology, Qingdao University, China.
Neural Netw. 2020 Feb;122:279-288. doi: 10.1016/j.neunet.2019.10.010. Epub 2019 Nov 6.
Multiview clustering has attracted increasing attention because it can handle data from multiple sources (views) and exploit the complementary information across views. Among existing methods, multiview subspace clustering methods deliver encouraging performance. They mainly integrate multiview information in the space where the data points lie, so their performance may degrade because of noise in individual views or inconsistency between heterogeneous features. The basic premise of multiview clustering is that all views share a common partition; the natural space for multiview clustering is therefore the space of partitions. Orthogonal to existing methods, we propose to fuse multiview information at the partition level, following two intuitive assumptions: (i) each partition is a perturbation of the consensus clustering; (ii) a partition that is close to the consensus clustering should be assigned a large weight. We then propose a unified multiview subspace clustering model that incorporates graph learning from each view, generation of basic partitions, and fusion into a consensus partition. These three components are seamlessly integrated and iteratively boost one another toward an overall optimal solution. Experiments on four benchmark datasets demonstrate the efficacy of our approach against state-of-the-art techniques.
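The two assumptions can be illustrated with a small sketch of weighted partition-level fusion. This is not the authors' model: the graph-learning step is left out, each basic partition is taken to be an orthonormal spectral embedding F_v, and the weight rule (inverse distance between Y Y^T and F_v F_v^T) is an assumption chosen only because it satisfies assumption (ii). The names `spectral_partition` and `fuse_partitions` are hypothetical.

```python
import numpy as np

def spectral_partition(affinity, k):
    """Return the k leading eigenvectors of the symmetrized affinity matrix."""
    sym = (affinity + affinity.T) / 2.0
    _, vecs = np.linalg.eigh(sym)  # eigenvalues are returned in ascending order
    return vecs[:, -k:]

def fuse_partitions(partitions, k, n_iter=20, eps=1e-12):
    """Alternate between a consensus partition Y and per-view weights.

    Assumed objective (illustrative only): sum_v w_v * ||Y Y^T - F_v F_v^T||_F^2
    with orthonormal Y. Comparing F F^T instead of F makes the distance
    invariant to rotations of each embedding.
    """
    projections = [F @ F.T for F in partitions]
    weights = np.full(len(partitions), 1.0 / len(partitions))
    for _ in range(n_iter):
        # For fixed weights, the best orthonormal Y spans the top-k
        # eigenvectors of the weighted sum of projections.
        combined = sum(w * P for w, P in zip(weights, projections))
        Y = spectral_partition(combined, k)
        # Assumption (ii): partitions closer to the consensus get larger weight.
        dists = np.array([np.linalg.norm(Y @ Y.T - P, "fro") for P in projections])
        weights = 1.0 / (2.0 * (dists + eps))
        weights /= weights.sum()  # normalization does not change Y
    return Y, weights

# Toy data: two views close to a two-cluster block structure, one corrupted view.
rng = np.random.default_rng(0)
n, k = 6, 2
block = np.kron(np.eye(k), np.ones((n // k, n // k)))
clean_a = block + 0.05 * rng.normal(size=(n, n))
clean_b = block + 0.05 * rng.normal(size=(n, n))
corrupted = rng.normal(size=(n, n))

partitions = [spectral_partition(A, k) for A in (clean_a, clean_b, corrupted)]
Y, weights = fuse_partitions(partitions, k)
print(np.round(weights, 3))
```

In this toy setting the corrupted view should end up with the smallest weight, which is the behavior assumption (ii) asks for. Discrete cluster labels could then be obtained by running k-means on the rows of `Y`.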