Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, P. R. China.
Int J Neural Syst. 2023 Jul;33(7):2350035. doi: 10.1142/S0129065723500351. Epub 2023 Jun 14.
Zero-shot detection (ZSD) aims to locate and classify unseen objects in images or videos using auxiliary semantic information, without additional training examples. Most existing ZSD methods are based on two-stage models, which detect unseen classes by aligning object region proposals with semantic embeddings. However, these methods have several limitations, including poor region proposals for unseen classes, failure to consider the semantic representations of unseen classes or their inter-class correlations, and domain bias towards seen classes, all of which can degrade overall performance. To address these issues, the Trans-ZSD framework is proposed: a transformer-based multi-scale contextual detection framework that explicitly exploits inter-class correlations between seen and unseen classes and optimizes the feature distribution to learn discriminative features. Trans-ZSD is a single-stage approach that skips proposal generation and performs detection directly, encoding long-range dependencies at multiple scales to learn contextual features while requiring fewer inductive biases. Trans-ZSD also introduces a foreground-background separation branch to alleviate confusion between unseen classes and background, contrastive learning to capture inter-class uniqueness and reduce misclassification between similar classes, and explicit inter-class commonality learning to facilitate generalization between related classes. Trans-ZSD further addresses the domain bias problem in end-to-end generalized zero-shot detection (GZSD) models by using a balance loss to maximize response consistency between seen and unseen predictions, preventing the model from biasing towards seen classes. The Trans-ZSD framework is evaluated on the PASCAL VOC and MS COCO datasets, demonstrating significant improvements over existing ZSD models.
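To illustrate the idea behind the balance loss described above, the following is a minimal, hypothetical sketch in PyTorch. The function name `balance_loss`, the L1 gap between peak seen and unseen responses, and the index tensors are assumptions for illustration only; the paper's actual formulation may differ.

```python
import torch


def balance_loss(logits: torch.Tensor,
                 seen_idx: torch.Tensor,
                 unseen_idx: torch.Tensor) -> torch.Tensor:
    """Hypothetical balance loss: encourage comparable peak responses
    for seen and unseen classes so predictions are not biased towards
    seen classes (a sketch, not the paper's exact formulation)."""
    # Convert raw class logits to probabilities over all classes.
    probs = logits.softmax(dim=-1)
    # Strongest response among seen classes and among unseen classes,
    # computed per predicted box.
    seen_peak = probs[:, seen_idx].max(dim=-1).values
    unseen_peak = probs[:, unseen_idx].max(dim=-1).values
    # Penalize the discrepancy between the two peaks (L1 gap).
    return (seen_peak - unseen_peak).abs().mean()


if __name__ == "__main__":
    # Toy example: 4 predicted boxes, 10 classes (7 seen, 3 unseen).
    logits = torch.randn(4, 10)
    seen_idx = torch.arange(0, 7)
    unseen_idx = torch.arange(7, 10)
    print(balance_loss(logits, seen_idx, unseen_idx))
```

In this sketch, minimizing the loss pulls the strongest seen-class and unseen-class responses towards each other, which is one simple way to realize the "response consistency" objective mentioned in the abstract.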