Hou Zhengzhang, Li Zhanshan, Li Jingyao
College of Software, Jilin University, Changchun, 130012, Jilin, China; Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, 130012, Jilin, China.
College of Software, Jilin University, Changchun, 130012, Jilin, China; College of Computer Science and Technology, Jilin University, Changchun, 130012, Jilin, China; Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, 130012, Jilin, China.
Neural Netw. 2025 Aug;188:107423. doi: 10.1016/j.neunet.2025.107423. Epub 2025 Mar 29.
Generative zero-shot learning methods synthesize features for unseen classes by learning from image features and class semantic vectors, effectively addressing the bias that arises when transferring knowledge from seen to unseen classes. However, existing methods directly employ global image features without incorporating semantic information, so they cannot ensure that the features synthesized for unseen classes remain semantically consistent; as a result, the synthesized features lack discriminative power. To address these limitations, we propose a Bidirectional Semantic Consistency Guided (BSCG) generation model. The BSCG model employs a Bidirectional Semantic Guidance Framework (BSGF) that combines Attribute-to-Visual Guidance (AVG) and Visual-to-Attribute Guidance (VAG) to strengthen the interaction and mutual learning between visual features and attribute semantics. In addition, we propose a Contrastive Consistency Space (CCS) that further improves feature quality by increasing intra-class compactness and inter-class separability. Together, these components ensure robust knowledge transfer and improve the model's generalization ability. Extensive experiments on three benchmark datasets show that BSCG significantly outperforms state-of-the-art approaches in both conventional and generalized zero-shot learning settings. The code is available at: https://github.com/ithicker/BSCG.
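The abstract does not specify the exact objective used in the Contrastive Consistency Space, but a supervised contrastive loss is the standard way to enforce intra-class compactness and inter-class separability over synthesized features. The sketch below illustrates that idea only; the function name, the temperature value, the L2 normalization, and the PyTorch framing are all assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def contrastive_consistency_loss(features: torch.Tensor,
                                 labels: torch.Tensor,
                                 temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive objective over L2-normalized features.

    Same-class pairs are pulled together (intra-class compactness) and
    different-class pairs are pushed apart (inter-class separability).
    Hypothetical illustration of a CCS-style loss, not the paper's code.
    """
    z = F.normalize(features, dim=1)                 # (N, d) unit vectors
    sim = z @ z.t() / temperature                    # pairwise cosine / T
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))  # exclude self-pairs
    # Log-softmax over each anchor's similarities to all other samples.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Positives: samples sharing the anchor's label (self excluded).
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0                           # anchors with a positive
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(pos_log_prob[valid] / pos_counts[valid]).mean()

# Example: 8 synthesized features (dim 16) across 4 classes.
feats = torch.randn(8, 16)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(contrastive_consistency_loss(feats, labels))
```

Minimizing this loss raises the similarity of same-class feature pairs relative to cross-class pairs, which is the compactness/separability behavior the CCS is described as optimizing.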