
Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2017 Sep;39(9):1783-1796. doi: 10.1109/TPAMI.2016.2613873. Epub 2016 Sep 27.

DOI:10.1109/TPAMI.2016.2613873
PMID:28114059
Abstract

Visual search and image retrieval underpin numerous applications; however, the task remains challenging, predominantly because of the variability of object appearance and the ever-increasing size of databases, which often exceed billions of images. Prior-art methods rely on aggregating local scale-invariant descriptors, such as SIFT, via mechanisms including Bag of Visual Words (BoW), Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV). However, their performance still falls short of what is required. This paper presents a novel method for deriving a compact and distinctive representation of image content called Robust Visual Descriptor with Whitening (RVD-W). It significantly advances the state of the art and delivers world-class performance. In our approach, local descriptors are rank-assigned to multiple clusters. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization function and aggregated based on the neighborhood rank. Importantly, the residual vectors are de-correlated and whitened in each cluster before aggregation, leading to a balanced energy distribution in each dimension and significantly improved performance. We also propose a new post-PCA normalization approach which improves separability between the matching and non-matching global descriptors. This new normalization benefits not only our RVD-W descriptor but also existing approaches based on FV and VLAD aggregation. Furthermore, we show that the aggregation framework developed using hand-crafted SIFT features also performs exceptionally well with Convolutional Neural Network (CNN) based features. The RVD-W pipeline outperforms state-of-the-art global descriptors on both the Holidays and Oxford datasets. On the large-scale datasets Holidays1M and Oxford1M, the SIFT-based RVD-W representation obtains a mAP of 45.1 and 35.1 percent, respectively, while the CNN-based RVD-W achieves a mAP of 63.5 and 44.8 percent, all superior to the state of the art.
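
The aggregation steps named in the abstract (rank-based assignment to several clusters, per-cluster residuals, whitening before aggregation, direction-preserving normalization, rank-weighted summation) can be illustrated with a rough NumPy sketch. This is not the authors' implementation. The abstract does not give the number of clusters per descriptor, the rank weights, the normalization function, or the order of whitening and normalization, so the values and choices below (three nearest clusters, weights `1.0, 0.5, 0.25`, L2 normalization applied after whitening, a final global L2 normalization) are assumptions made only for illustration. The function names `fit_cluster_whitening` and `aggregate` are hypothetical, and the post-PCA normalization is omitted because the abstract does not describe it.

```python
import numpy as np


def fit_cluster_whitening(descriptors, centers, eps=1e-6):
    """Estimate one whitening matrix per cluster from training residuals."""
    # Hard-assign every training descriptor to its nearest center.
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    dim = centers.shape[1]
    whiteners = []
    for k in range(len(centers)):
        res = descriptors[nearest == k] - centers[k]
        if len(res) < 2:
            # Too few samples to estimate a covariance: fall back to identity.
            whiteners.append(np.eye(dim))
            continue
        cov = res.T @ res / len(res)
        eigval, eigvec = np.linalg.eigh(cov)
        scale = 1.0 / np.sqrt(np.maximum(eigval, 0.0) + eps)
        # Rows of the whitener are eigenvectors scaled by 1/sqrt(eigenvalue),
        # which de-correlates the residuals and equalizes their variance.
        whiteners.append((eigvec * scale).T)
    return np.stack(whiteners)


def aggregate(descriptors, centers, whiteners, rank_weights=(1.0, 0.5, 0.25)):
    """Aggregate local descriptors into one global vector (illustrative only)."""
    n_clusters, dim = centers.shape
    out = np.zeros((n_clusters, dim))
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # ranked[i, r] is the index of the (r+1)-th nearest center to descriptor i.
    ranked = np.argsort(d2, axis=1)[:, : len(rank_weights)]
    for r, weight in enumerate(rank_weights):  # assumed rank weights
        for i, k in enumerate(ranked[:, r]):
            res = whiteners[k] @ (descriptors[i] - centers[k])
            norm = np.linalg.norm(res)
            if norm > 0:
                # L2 scaling keeps the residual's direction (assumed choice).
                out[k] += weight * res / norm
    vec = out.ravel()
    total = np.linalg.norm(vec)
    return vec / total if total > 0 else vec


# Toy usage with random data standing in for SIFT or CNN features.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 8))
centers = train[rng.choice(500, size=4, replace=False)]  # stand-in for k-means
W = fit_cluster_whitening(train, centers)
g = aggregate(rng.normal(size=(50, 8)), centers, W)
print(g.shape)  # (32,)
```

In this sketch the global descriptor has one `dim`-sized block per cluster, so its length is the number of clusters times the local descriptor dimension. A real pipeline would learn the centers with k-means on a training set and would typically reduce the result with PCA afterwards.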


