

Unifying Visual Attribute Learning with Object Recognition in a Multiplicative Framework

Author Information

Liang Kongming, Chang Hong, Ma Bingpeng, Shan Shiguang, Chen Xilin

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2019 Jul;41(7):1747-1760. doi: 10.1109/TPAMI.2018.2836461. Epub 2018 Jun 4.

Abstract

Attributes are mid-level semantic properties of objects. Recent research has shown that visual attributes can benefit many typical learning problems in the computer vision community. However, attribute learning remains challenging, because attributes may not always be predictable directly from input images and the variation of visual attributes across categories is sometimes large. In this paper, we propose a unified multiplicative framework for attribute learning that tackles these key problems. Specifically, images and category information are jointly projected into a shared feature space, where the latent factors are disentangled and multiplied to perform attribute prediction. The resulting attribute classifier is category-specific rather than shared by all categories. Moreover, our model can leverage auxiliary data to enhance the predictive ability of attribute classifiers, which reduces the effort of instance-level attribute annotation to some extent. When integrated into an existing deep learning framework, our model can both accurately predict attributes and learn efficient image representations. Experimental results show that our method achieves superior performance on both instance-level and category-level attribute prediction. For zero-shot learning based on visual attributes and for human-object interaction recognition, our method improves on the state-of-the-art performance on several widely used datasets.
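The core multiplicative idea in the abstract can be illustrated with a minimal sketch: project an image feature and a category indicator into a shared latent space, then multiply the two latent factors elementwise so that the effective attribute classifier depends on the category. The projection matrices `W_img` and `W_cat` and the function names below are illustrative assumptions, not the paper's actual implementation (which is integrated into a deep network).

```python
import math

def project(matrix, vec):
    """Linear projection into the shared latent space: matrix @ vec."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def attribute_score(img_feat, cat_onehot, W_img, W_cat, sigmoid=True):
    """Multiplicative attribute prediction (illustrative sketch):
    the image and category latent factors are multiplied elementwise
    and summed, so the classifier applied to the image is
    category-specific rather than shared across all categories."""
    img_latent = project(W_img, img_feat)    # image factor
    cat_latent = project(W_cat, cat_onehot)  # category factor
    score = sum(a * b for a, b in zip(img_latent, cat_latent))
    return 1.0 / (1.0 + math.exp(-score)) if sigmoid else score
```

For example, with a fixed image feature, two different category one-hot vectors yield different scores for the same attribute, which is the category-specific behavior the abstract describes.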

