Li Jin, Lan Xuguang, Long Yang, Liu Yang, Chen Xingyu, Shao Ling, Zheng Nanning
IEEE Trans Image Process. 2020 Apr 15. doi: 10.1109/TIP.2020.2986892.
The fundamental problem of Zero-Shot Learning (ZSL) is that the one-hot label space is discrete, which leads to a complete loss of the relationships between seen and unseen classes. Conventional approaches rely on semantic auxiliary information, e.g., attributes, to re-encode each class so as to preserve the inter-class associations. However, existing learning algorithms focus only on unifying the visual and semantic spaces without jointly considering the label space. More importantly, because the final classification is conducted in the label space through a compatibility function, the gap between the attribute and label spaces leads to significant performance degradation. Therefore, this paper proposes a novel pathway, named Attributing Label Space (ALS), that uses the label space to directly and jointly reconcile the visual and semantic spaces. In the training phase, one-hot labels of seen classes are used directly as prototypes in a common space into which both images and attributes are mapped. Since the mappings can be optimized independently, the computational complexity is extremely low. In addition, the correlation between semantic attributes has less influence on visual embedding training because features are mapped to labels instead of attributes. In the testing phase, the discreteness constraint on the label space is removed, and the prior one-hot labels are used to denote seen classes and further to compose the labels of unseen classes. As a result, the label space is highly discriminative for Generalized ZSL (GZSL), a setting that is more realistic and more challenging for real-world applications. Extensive experiments on five benchmarks demonstrate improved performance over all compared state-of-the-art methods.
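The training and testing phases described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes closed-form ridge regression for both mappings and cosine-similarity nearest-prototype classification, neither of which the abstract specifies, and the function names (`ridge_fit`, `compose_class_labels`, `predict`) are hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam*I)^{-1} X^T Y.
    Assumption: the abstract does not name the regressor used."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def compose_class_labels(attr_seen, attr_unseen, lam=1.0):
    """Build label-space prototypes for all classes.

    Seen classes keep their one-hot labels. Unseen classes get soft
    (non-discrete) labels by mapping their attributes through a
    regressor trained to send seen-class attributes to one-hot labels.
    """
    n_seen = attr_seen.shape[0]
    one_hot = np.eye(n_seen)
    W_attr = ridge_fit(attr_seen, one_hot, lam)   # attributes -> label space
    unseen_labels = attr_unseen @ W_attr          # composed from seen labels
    return np.vstack([one_hot, unseen_labels])

def predict(features, W_vis, prototypes, eps=1e-12):
    """Map visual features into the label space and return the index of
    the most similar prototype (cosine similarity, an assumed choice)."""
    emb = features @ W_vis
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + eps)
    protos = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    return np.argmax(emb @ protos.T, axis=1)
```

In this sketch the visual mapping `W_vis` would be fitted separately with `ridge_fit(features_seen, one_hot_labels_seen)`, which mirrors the abstract's point that the two mappings can be optimized independently. Indices returned by `predict` cover seen classes first and unseen classes after them, as in the GZSL setting.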