L(p)-norm IDF for scalable image retrieval.

Publication information

IEEE Trans Image Process. 2014 Aug;23(8):3604-17. doi: 10.1109/TIP.2014.2329182. Epub 2014 Jun 5.

Abstract

The inverse document frequency (IDF) is widely used in bag-of-words-based image retrieval. The basic idea is to assign less weight to terms with high frequency and more weight to rare terms. However, in the conventional IDF routine, the estimation of visual word frequency is coarse and heuristic, so its effectiveness is compromised and far from optimal. To address this problem, this paper introduces a novel IDF family based on the Lp-norm pooling technique. The proposed IDF takes into account the term frequency, the document frequency, the complexity of images, and the codebook information. We further propose a parameter tuning strategy that helps balance the TF and pIDF weights, yielding the so-called Lp-norm IDF (pIDF). We show that the conventional IDF is a special case of our generalized version, and that two novel IDFs, i.e., the average IDF and the max IDF, can be defined from the concept of pIDF. Further, by accounting for the term frequency in each image, the proposed pIDF helps to alleviate the visual word burstiness phenomenon. Our method is evaluated through extensive experiments on four benchmark data sets (Oxford 5K, Paris 6K, Holidays, and Ukbench). We show that the pIDF works well on large-scale databases and when the codebook is trained on irrelevant data. We report a mean average precision improvement of up to +13.0% over the baseline TF-IDF approach on a 1M data set. In addition, the pIDF has a wide application scope, ranging from buildings to general objects and scenes. When combined with postprocessing steps, we achieve competitive results compared with state-of-the-art methods. Finally, since the pIDF is computed offline, it introduces no extra computation or memory cost to the system.
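
To make the pooling idea concrete, here is a minimal Python sketch. It assumes, as a simplification not spelled out in the abstract, that the document frequency inside the usual log(N / df) is replaced by the Lp norm of a visual word's per-image term-frequency vector. The function names `lp_pooled_frequency` and `lp_idf` and the toy matrix are hypothetical, and the paper's handling of image complexity, codebook information, and parameter tuning is not reproduced.

```python
import math


def lp_pooled_frequency(tf_column, p):
    """Pool one visual word's per-image counts with an Lp norm (p >= 0).

    p == 0 counts the images that contain the word (the usual document
    frequency); p == inf takes the largest per-image count.
    """
    if p == 0:
        return float(sum(1 for t in tf_column if t > 0))
    if math.isinf(p):
        return float(max(tf_column, default=0))
    return sum(t ** p for t in tf_column) ** (1.0 / p)


def lp_idf(tf_matrix, p):
    """Return one weight per visual word (assumed form, see lead-in).

    tf_matrix[j][i] is the count of visual word i in image j; the matrix
    is assumed to be non-empty and rectangular.
    """
    n_images = len(tf_matrix)
    n_words = len(tf_matrix[0])
    weights = []
    for i in range(n_words):
        column = [row[i] for row in tf_matrix]
        pooled = lp_pooled_frequency(column, p)
        # Words that never occur get weight 0 to avoid dividing by zero.
        weights.append(math.log(n_images / pooled) if pooled > 0 else 0.0)
    return weights


# Toy data: 4 images, 2 visual words. Both words occur in 2 images,
# but word 1 is bursty (8 occurrences in the first image).
toy_tf = [
    [1, 8],
    [1, 1],
    [0, 0],
    [0, 0],
]

for p in (0, 2, float("inf")):
    print(p, lp_idf(toy_tf, p))
```

With p = 0 the two toy words receive the same weight, since each occurs in two of the four images, which is the conventional IDF behaviour. With p = 2 the bursty word is weighted lower than the non-bursty one. In this unnormalized form the weight can become negative once the pooled frequency exceeds the number of images.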
