Yin Qiyue, Wu Shu, Wang Liang
IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5541-5553. doi: 10.1109/TNNLS.2017.2786743. Epub 2018 Mar 7.
Multiview clustering, which uses multiple distinct feature sets to boost clustering performance, has a wide range of applications. Subspace-based approaches, a widely used family of methods, learn a unified embedding from multiple sources of information and perform relatively well. However, these methods usually ignore data similarity rankings; for example, example A may be more similar to B than to C, and such similarity triplets may be more effective in revealing the cluster structure of the data. Motivated by recent embedding methods for modeling knowledge graphs in natural language processing, this paper proposes to model different views as different relations in a knowledge graph for unified and view-specific embedding learning. Moreover, in real applications, some views often suffer from missing information, leading to incomplete multiview data. In such a scenario, the performance of conventional multiview clustering degrades notably, whereas the proposed method extends naturally to incomplete multiview clustering, making full use of examples with incomplete feature sets to improve the model. Finally, we demonstrate through extensive experiments that our method outperforms state-of-the-art clustering methods.
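The notion of a similarity triplet mentioned in the abstract (A is more similar to B than to C) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract gives neither the triplet-selection rule nor the objective, so the nearest-versus-farthest rule, the hinge loss, the squared Euclidean distance, and the names `similarity_triplets` and `triplet_hinge_loss` are all assumptions made for illustration.

```python
import numpy as np


def similarity_triplets(X, n_neighbors=1):
    """Return (anchor, near, far) index triplets for a single view.

    Assumed rule (not from the paper): for each anchor i, `near` ranges over
    its n_neighbors closest examples and `far` is the example farthest from i.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances within this view.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    triplets = []
    for i in range(len(X)):
        order = np.argsort(dist[i])
        order = order[order != i]  # drop the anchor itself
        far = int(order[-1])
        for near in order[:n_neighbors]:
            if int(near) != far:
                triplets.append((i, int(near), far))
    return triplets


def triplet_hinge_loss(E, triplets, margin=1.0):
    """Mean hinge loss asking d(anchor, near) + margin <= d(anchor, far).

    E holds one embedding per example; the hinge form is an assumed,
    generic ranking loss, not the objective used in the paper.
    """
    E = np.asarray(E, dtype=float)
    a, p, n = np.array(triplets).T
    d_near = ((E[a] - E[p]) ** 2).sum(axis=1)
    d_far = ((E[a] - E[n]) ** 2).sum(axis=1)
    return float(np.maximum(0.0, margin + d_near - d_far).mean())
```

In a multiview setting, one would presumably extract triplets separately from each view's feature matrix, skipping examples whose features are missing in that view, and then fit embeddings that satisfy as many triplets as possible across views.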