IEEE Trans Cybern. 2022 Oct;52(10):10490-10503. doi: 10.1109/TCYB.2021.3062830. Epub 2022 Sep 19.
Multiview clustering aims to leverage information from multiple views to improve clustering performance. Most previous works assume that each view has complete data. However, in real-world datasets, a view often contains missing data, which gives rise to the problem of incomplete multiview clustering (IMC). Previous approaches to this problem have at least one of the following drawbacks: 1) employing shallow models, which cannot adequately handle the dependence and discrepancy among different views; 2) ignoring the hidden information of the missing data; and 3) being limited to the two-view case. To address all of these drawbacks, we present the adversarial IMC (AIMC) framework. In particular, AIMC seeks a common latent representation of multiview data for reconstructing raw data and inferring missing data. Elementwise reconstruction and a generative adversarial network are integrated to evaluate the reconstruction; they aim to capture the overall structure and to gain a deeper semantic understanding, respectively. Moreover, a clustering loss is designed to obtain a better clustering structure. We explore two variants of AIMC, namely: 1) autoencoder-based AIMC (AAIMC) and 2) generalized AIMC (GAIMC), which use different strategies to obtain the multiview common representation. Experiments conducted on six real-world datasets show that AAIMC and GAIMC perform well and outperform the baseline methods.
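The abstract does not give the loss definitions, so the following is only a minimal sketch of one idea it describes: scoring an elementwise reconstruction on the observed samples of each view, then filling in the missing samples from the decoded common representation. It assumes linear decoders, a mean-squared-error criterion, and per-sample boolean masks; the names `masked_reconstruction_loss`, `impute_missing`, `H`, and `decoders` are hypothetical and do not come from the paper. The adversarial and clustering terms are omitted.

```python
import numpy as np


def masked_reconstruction_loss(views, masks, recons):
    """Mean squared error computed over observed samples only.

    views, recons: lists of arrays, each of shape (n_samples, dim_v)
    masks: list of boolean arrays of shape (n_samples,), True = observed
    """
    total, count = 0.0, 0
    for x, m, r in zip(views, masks, recons):
        sq_err = (x[m] - r[m]) ** 2  # keep only rows observed in this view
        total += sq_err.sum()
        count += sq_err.size
    return total / max(count, 1)


def impute_missing(views, masks, recons):
    """Keep observed rows as-is; fill missing rows with the reconstruction."""
    return [np.where(m[:, None], x, r) for x, m, r in zip(views, masks, recons)]


# Toy data (assumption): 6 samples, 2 views, each view missing 2 samples.
rng = np.random.default_rng(0)
n_samples, latent_dim, view_dims = 6, 2, [4, 3]

H = rng.normal(size=(n_samples, latent_dim))  # hypothetical common representation
decoders = [rng.normal(size=(latent_dim, d)) for d in view_dims]  # hypothetical linear decoders
recons = [H @ W for W in decoders]

views = [r + 0.1 * rng.normal(size=r.shape) for r in recons]
masks = [
    np.array([True, True, True, True, False, False]),
    np.array([True, True, False, False, True, True]),
]
for x, m in zip(views, masks):
    x[~m] = 0.0  # missing samples carry no information

loss = masked_reconstruction_loss(views, masks, recons)
completed = impute_missing(views, masks, recons)
```

In a full model of the kind the abstract describes, a term like this would be combined with an adversarial loss and a clustering loss, and the common representation would be learned rather than sampled at random as it is here.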