Li Jingjing, Jing Mengmeng, Lu Ke, Zhu Lei, Shen Heng Tao
IEEE Trans Cybern. 2022 Aug;52(8):8167-8178. doi: 10.1109/TCYB.2021.3050803. Epub 2022 Jul 19.
Zero-shot learning (ZSL) is an intriguing topic in the computer vision community because it handles novel instances and unseen categories. In a typical ZSL setting, there is a main visual space and an auxiliary semantic space. Most existing ZSL methods address the problem by learning either a visual-to-semantic mapping or a semantic-to-visual mapping. In other words, they investigate a unilateral connection from one end to the other. However, the connection between the visual space and the semantic space is bilateral in reality: the visual space depicts the semantic space, and the semantic space in turn describes the visual space. In this article, therefore, we investigate the bilateral connections in ZSL and present a novel model, called Boomerang-GAN, built on conditional generative adversarial networks (GANs). Specifically, we generate unseen visual samples from their category semantic embeddings with a conditional GAN. Unlike existing generative ZSL methods, which only consider generating visual features from class descriptions, our method also translates the generated visual features back to their corresponding semantic embeddings by introducing a multimodal cycle-consistent loss. Extensive experiments on both ZSL and generalized ZSL across five widely used datasets verify that our method outperforms previous state-of-the-art approaches in both recognition and segmentation tasks.
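The round trip described above (semantic embedding to generated visual feature and back to a semantic embedding) can be illustrated with a small sketch. The abstract does not give the form of the cycle-consistent loss, so the mean squared reconstruction error used here is an assumption, and the linear `generate` and `regress` maps and their weight matrices are hypothetical stand-ins for the learned networks, not the authors' implementation.

```python
def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def generate(W_g, attr, noise):
    """Hypothetical linear generator: [semantic embedding; noise] -> visual feature."""
    return matvec(W_g, attr + noise)


def regress(W_r, feat):
    """Hypothetical linear regressor: visual feature -> semantic embedding."""
    return matvec(W_r, feat)


def cycle_loss(W_g, W_r, attr, noise):
    """Assumed loss form: mean squared error between the class embedding
    and its reconstruction after the semantic -> visual -> semantic trip."""
    recon = regress(W_r, generate(W_g, attr, noise))
    return sum((r - a) ** 2 for r, a in zip(recon, attr)) / len(attr)


# Toy sizes: 2-d semantic embedding, 1-d noise, 3-d visual feature.
W_g = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W_r_good = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # inverts the generator exactly
W_r_bad = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # distorts the first attribute

attr, noise = [1.0, 0.5], [0.3]
print(cycle_loss(W_g, W_r_good, attr, noise))
print(cycle_loss(W_g, W_r_bad, attr, noise))
```

In a full generative ZSL pipeline, a term of this kind would be added to the conditional GAN objective so that generated features stay consistent with the class embedding they were conditioned on. The weighting between the two terms is not stated in the abstract.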