Wang Qianqian, Tao Zhiqiang, Xia Wei, Gao Quanxue, Cao Xiaochun, Jiao Licheng
IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7635-7647. doi: 10.1109/TNNLS.2022.3145048. Epub 2023 Oct 5.
The existing deep multiview clustering (MVC) methods are mainly based on autoencoder networks, which seek common latent variables to reconstruct the original input of each view individually. However, due to the view-specific reconstruction loss, it is challenging to extract consistent latent representations over multiple views for clustering. To address this challenge, we propose adversarial MVC (AMvC) networks in this article. The proposed AMvC generates each view's samples conditioned on the fused latent representations among different views to encourage a more consistent clustering structure. Specifically, multiview encoders are used to extract latent descriptions from all the views, and the corresponding generators are used to generate the reconstructed samples. The discriminative networks and the mean squared loss are jointly utilized for training the multiview encoders and generators to balance the distinctness and consistency of each view's latent representation. Moreover, an adaptive fusion layer is developed to obtain a shared latent representation, on which a clustering loss and the l-norm constraint are further imposed to improve clustering performance and distinguish the latent space. Experimental results on video, image, and text datasets demonstrate the effectiveness of our AMvC over several state-of-the-art deep MVC methods.
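The adaptive fusion layer described in the abstract combines each view's latent representation into a single shared representation. A minimal sketch of one plausible weighted-fusion rule, assuming softmax-normalized per-view weights over same-dimensional latents (the function names, shapes, and the softmax weighting are illustrative assumptions, not the paper's actual implementation):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of logits
    e = np.exp(x - np.max(x))
    return e / e.sum()

def adaptive_fusion(latents, weight_logits):
    """Fuse per-view latent representations into a shared one.

    latents: list of (n_samples, d) arrays, one per view.
    weight_logits: (n_views,) learnable logits; softmax turns them
    into non-negative view weights summing to 1. This weighted-sum
    rule is an assumption for illustration, not the paper's layer.
    """
    w = softmax(weight_logits)
    fused = sum(wi * z for wi, z in zip(w, latents))
    return fused, w

# Toy example: two views, 4 samples, latent dimension 3.
rng = np.random.default_rng(0)
z1 = rng.standard_normal((4, 3))
z2 = rng.standard_normal((4, 3))
fused, w = adaptive_fusion([z1, z2], np.array([0.5, -0.5]))
```

In a full pipeline, the fused representation would then feed both the per-view generators and the clustering loss, so that gradients from the clustering objective shape a shared latent space across views.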