Wang Qianqian, Ding Zhengming, Tao Zhiqiang, Gao Quanxue, Fu Yun
IEEE Trans Image Process. 2021;30:1771-1783. doi: 10.1109/TIP.2020.3048626. Epub 2021 Jan 14.
With the rapid development of data collection sources and feature extraction methods, multi-view data have become easy to obtain and have received increasing research attention in recent years. Within this area, multi-view clustering (MVC) forms a mainstream research direction and is widely used in data analysis. However, existing MVC methods mainly assume that each sample appears in all views, without considering the incomplete-view case caused by data corruption, sensor failure, equipment malfunction, etc. In this study, we design and build a generative partial multi-view clustering model with adaptive fusion and cycle consistency, named GP-MVC, which solves the incomplete multi-view problem by explicitly generating the data of missing views. The main idea of GP-MVC is two-fold. First, multi-view encoder networks are trained to learn common low-dimensional representations, followed by a clustering layer that captures the shared cluster structure across multiple views. Second, view-specific generative adversarial networks with multi-view cycle consistency are developed to generate the missing data of one view, conditioned on the shared representation given by the other views. These two steps promote each other: the learned common representation facilitates data imputation, and the generated data can further explore the view consistency. Moreover, a weighted adaptive fusion scheme is implemented to exploit the complementary information among different views. Experimental results on four benchmark datasets show the effectiveness of the proposed GP-MVC over state-of-the-art methods.
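As an illustration of the weighted adaptive fusion idea, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not specify the form of the fusion, so the softmax weighting, the function name `fuse_views`, and the parameters `latents`, `mask`, and `logits` are all assumptions made for this example. It combines per-view latent codes into one shared representation while skipping views that are missing for a given sample.

```python
import numpy as np

def fuse_views(latents, mask, logits):
    """Fuse per-view latent codes into one shared representation.

    latents: (V, N, D) array, one D-dim code per view and sample.
    mask:    (V, N) array, 1 where the view is observed, 0 where missing.
    logits:  (V,) array of hypothetical learnable view scores.

    Assumes every sample is observed in at least one view.
    """
    # Softmax turns the view scores into positive weights that sum to 1.
    w = np.exp(logits - logits.max())
    w = w / w.sum()
    # Zero out missing views, then renormalize the weights per sample.
    w = w[:, None] * mask                          # shape (V, N)
    w = w / w.sum(axis=0, keepdims=True)
    # Weighted sum over the view axis gives the fused code.
    return (w[:, :, None] * latents).sum(axis=0)   # shape (N, D)

# Two views, three samples; sample 1 lacks view 1, sample 2 lacks view 0.
latents = np.array([[[1., 1.], [2., 2.], [3., 3.]],
                    [[3., 3.], [4., 4.], [5., 5.]]])
mask = np.array([[1, 1, 0],
                 [1, 0, 1]])
fused = fuse_views(latents, mask, np.zeros(2))
```

With equal scores, a sample observed in both views receives the average of its two codes, while a sample observed in only one view keeps that view's code unchanged. In the model described above, the missing codes would instead be filled in by the view-specific generators before or alongside fusion; this sketch only shows the masking and weighting step.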