School of Software Engineering, Shenzhen Institute of Information Technology, Shenzhen 518172, China.
State Key Laboratory of Integrated Services Networks, Xidian University, Shaanxi 710071, China.
Neural Netw. 2022 Apr;148:176-182. doi: 10.1016/j.neunet.2022.01.007. Epub 2022 Jan 29.
Many approaches to generalized zero-shot learning (GZSL) rely on a cross-modal mapping between the image feature space and the class embedding space to transfer knowledge from seen to unseen classes. However, these two spaces are completely different and their manifolds are inconsistent. Existing methods also suffer from highly overlapping semantic descriptions of different classes, so in GZSL tasks unseen classes are easily misclassified as seen classes. To address these problems, we adopt a novel semantic embedding network that encodes more discriminative information from the initial semantic attributes into semantic embeddings in the visual space. Meanwhile, a distribution alignment constraint keeps the distribution of the learned semantic embeddings consistent with that of real image features. Moreover, an auxiliary classifier is used to improve the quality of the learned semantic embeddings. Finally, a relation network classifies unseen images by computing relation scores between the semantic embeddings and the image features, which is much more flexible than fixed distance metric functions. Experimental results demonstrate that the proposed method is superior to other state-of-the-art methods.
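The final classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the random untrained weights, the class names, and every function name here (`relation_score`, `classify`, `init_layer`) are assumptions made for illustration. It shows only the general idea of a relation network, which scores a concatenated (semantic embedding, image feature) pair with a small learned function instead of a fixed distance metric, and then picks the class with the highest score.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_layer(x, weights, biases, activation):
    """Apply one fully connected layer: activation(W x + b)."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def init_layer(rng, n_out, n_in):
    """Hypothetical random initialisation; a real model would be trained."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def relation_score(embedding, feature, params):
    """Score one (semantic embedding, image feature) pair, in [0, 1]."""
    (w1, b1), (w2, b2) = params
    pair = embedding + feature  # concatenate the two vectors
    hidden = mlp_layer(pair, w1, b1, relu)
    return mlp_layer(hidden, w2, b2, sigmoid)[0]


def classify(feature, class_embeddings, params):
    """Return the class whose embedding has the highest relation score."""
    scores = {name: relation_score(emb, feature, params)
              for name, emb in class_embeddings.items()}
    return max(scores, key=scores.get), scores


rng = random.Random(0)
dim = 4  # assumed shared dimensionality of embeddings and image features
params = (init_layer(rng, 8, 2 * dim), init_layer(rng, 1, 8))

# Placeholder vectors standing in for learned semantic embeddings
# (already mapped into the visual space) and one extracted image feature.
class_embeddings = {
    "seen_class": [rng.gauss(0, 1) for _ in range(dim)],
    "unseen_class": [rng.gauss(0, 1) for _ in range(dim)],
}
image_feature = [rng.gauss(0, 1) for _ in range(dim)]

predicted, scores = classify(image_feature, class_embeddings, params)
print(predicted, scores)
```

Because the weights are random, the printed prediction is meaningless. In a trained model the scoring function would be learned jointly with the semantic embedding network, which is what makes it more flexible than a fixed metric such as Euclidean or cosine distance.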