Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Feb;44(2):563-578. doi: 10.1109/TPAMI.2019.2932058. Epub 2022 Jan 7.

Abstract

The click feature of an image, defined as the user click frequency vector of the image on a predefined word vocabulary, is known to effectively reduce the semantic gap for fine-grained image recognition. Unfortunately, user click frequency data are usually absent in practice. Predicting the click feature from the visual feature remains challenging, because the user click frequency vector of an image is always noisy and sparse. In this paper, we devise a Hierarchical Deep Word Embedding (HDWE) model that integrates sparse constraints and an improved ReLU operator to address click feature prediction from visual features. HDWE is a coarse-to-fine click feature predictor that is learned with the help of an auxiliary image dataset containing click information. It can therefore discover the hierarchy of word semantics. We evaluate HDWE on three dog image datasets and one bird image dataset, with Clickture-Dog and Clickture-Bird serving as the respective auxiliary datasets that provide click data. Our empirical studies show that HDWE achieves (1) higher recognition accuracy, (2) a larger compression ratio, and (3) good one-shot learning ability and scalability to unseen categories.
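
To make the definition of a click feature concrete, the following is a minimal sketch of how a click frequency vector over a predefined vocabulary might be assembled, and why such vectors tend to be sparse. The input format (a list of query words under which users clicked the image), the toy vocabulary, and the function names `click_feature` and `sparsity` are illustrative assumptions; they are not taken from the paper and do not represent the HDWE model itself.

```python
from collections import Counter


def click_feature(clicked_queries, vocabulary):
    """Build the click frequency vector of one image.

    clicked_queries: query words under which users clicked the image
                     (hypothetical log format, assumed for illustration).
    vocabulary:      ordered list of words; one vector dimension per word.
    """
    index = {word: i for i, word in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    for word, count in Counter(clicked_queries).items():
        if word in index:  # words outside the vocabulary are ignored
            vector[index[word]] = count
    return vector


def sparsity(vector):
    """Fraction of zero entries in the vector."""
    return sum(1 for v in vector if v == 0) / len(vector)


# Toy vocabulary and click log, both made up for this example.
vocabulary = ["dog", "puppy", "husky", "beagle", "bird", "sparrow"]
clicked_queries = ["husky", "dog", "husky", "sled"]

feature = click_feature(clicked_queries, vocabulary)
print(feature)            # [1, 0, 2, 0, 0, 0]
print(sparsity(feature))  # fraction of vocabulary words never clicked
```

Even in this toy case most entries are zero, and a stray query such as "sled" shows how noise enters the log. With a realistic vocabulary of thousands of words the vector would be far sparser, which is the difficulty the abstract attributes to predicting click features from visual features.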
