Zero-Shot Human-Object Interaction Detection via Similarity Propagation

Authors

Zong Daoming, Sun Shiliang

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17805-17816. doi: 10.1109/TNNLS.2023.3309104. Epub 2024 Dec 2.

DOI:10.1109/TNNLS.2023.3309104

Abstract

Human-object interaction (HOI) detection involves identifying interactions represented as <human, action, object>, requiring the localization of human-object pairs and interaction classification within an image. This work focuses on the challenge of detecting HOIs with unseen objects using the prevalent Transformer architecture. Our empirical analysis reveals that the performance degradation of novel HOI instances primarily arises from misclassifying unseen objects as confusable seen objects. To address this issue, we propose a similarity propagation (SP) scheme that leverages cosine similarity distance to regulate the prediction margin between seen and unseen objects. In addition, we introduce pseudo-supervision for unseen objects based on class semantic similarities during training. Furthermore, we incorporate semantic-aware instance-level and interaction-level contrastive losses with Transformer to enhance intraclass compactness and interclass separability, resulting in improved visual representations. Extensive experiments on two challenging benchmarks, V-COCO and HICO-DET, demonstrate the effectiveness of our model, outperforming current state-of-the-art methods under various zero-shot settings.
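The abstract does not give the exact formulation of similarity propagation, so the following is only a rough illustration of the general idea of using cosine similarity between class embeddings to derive pseudo-scores for unseen object classes. It is not the authors' method: the function names (`cosine_similarity`, `propagate_to_unseen`), the toy two-dimensional embeddings, and the similarity-weighted averaging rule are all assumptions made for this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def propagate_to_unseen(seen_probs, seen_embs, unseen_embs):
    """Hypothetical pseudo-score for each unseen class.

    Each unseen class receives a similarity-weighted average of the
    seen-class probabilities (an assumed rule, not taken from the paper).
    """
    scores = {}
    for name, emb in unseen_embs.items():
        # Clip negative similarities to zero so dissimilar classes add nothing.
        weights = {
            s: max(0.0, cosine_similarity(emb, seen_embs[s])) for s in seen_probs
        }
        total = sum(weights.values())
        if total > 0:
            scores[name] = sum(weights[s] * seen_probs[s] for s in seen_probs) / total
        else:
            scores[name] = 0.0
    return scores


# Toy semantic embeddings (assumed values, for illustration only).
seen_embs = {"horse": [1.0, 0.0], "cup": [0.0, 1.0]}
unseen_embs = {"zebra": [0.9, 0.1]}

# Classifier probabilities over seen classes for one detected object.
seen_probs = {"horse": 0.8, "cup": 0.2}

pseudo = propagate_to_unseen(seen_probs, seen_embs, unseen_embs)
print(pseudo)
```

In this toy case the unseen class "zebra" lies close to "horse" in embedding space, so it inherits most of the "horse" probability. A real system would presumably use learned word or text embeddings and integrate such scores into the training loss.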

