Fine-Tuning CNN Image Retrieval with No Human Annotation.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2019 Jul;41(7):1655-1668. doi: 10.1109/TPAMI.2018.2846566. Epub 2018 Jun 12.

DOI:10.1109/TPAMI.2018.2846566

Abstract

Image descriptors based on activations of Convolutional Neural Networks (CNNs) have become dominant in image retrieval due to their discriminative power, compactness of representation, and search efficiency. Training of CNNs, either from scratch or fine-tuning, requires a large amount of annotated data, where a high quality of annotation is often crucial. In this work, we propose to fine-tune CNNs for image retrieval on a large collection of unordered images in a fully automated manner. Reconstructed 3D models obtained by the state-of-the-art retrieval and structure-from-motion methods guide the selection of the training data. We show that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval. CNN descriptor whitening discriminatively learned from the same training data outperforms commonly used PCA whitening. We propose a novel trainable Generalized-Mean (GeM) pooling layer that generalizes max and average pooling and show that it boosts retrieval performance. Applying the proposed method to the VGG network achieves state-of-the-art performance on the standard benchmarks: Oxford Buildings, Paris, and Holidays datasets.
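The abstract describes GeM pooling only as a layer that generalizes max and average pooling. A minimal sketch of that idea follows, assuming the usual generalized-mean form (the p-th root of the mean of the p-th powers of the activations in each channel). The function name `gem_pool`, the clamp `eps`, and the toy feature map are illustrative assumptions, and the exponent `p` is held fixed here, whereas the paper's layer learns it during training.

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the spatial positions of each channel.

    feature_map: list of channels, each a flat list of non-negative
                 activations (e.g. post-ReLU responses), hypothetical layout.
    p:           pooling exponent; p = 1 gives average pooling and a large p
                 approaches max pooling.
    eps:         lower clamp so that zero activations stay well defined
                 under the power (an assumption of this sketch).
    """
    pooled = []
    for channel in feature_map:
        clamped = [max(x, eps) for x in channel]
        mean_of_powers = sum(x ** p for x in clamped) / len(clamped)
        pooled.append(mean_of_powers ** (1.0 / p))
    return pooled


# Toy feature map: 2 channels, 4 spatial positions each.
fmap = [[0.0, 1.0, 2.0, 3.0],
        [4.0, 4.0, 4.0, 4.0]]

avg_like = gem_pool(fmap, p=1.0)    # behaves like average pooling
max_like = gem_pool(fmap, p=100.0)  # approaches max pooling
```

With `p = 1` the first channel pools to roughly its arithmetic mean, and with a large `p` it moves toward its maximum value of 3, which is the sense in which one layer covers both pooling schemes.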

Similar articles

A novel feature representation: Aggregating convolution kernels for image retrieval.

Neural Netw. 2020 Oct;130:1-10. doi: 10.1016/j.neunet.2020.06.010. Epub 2020 Jun 24.

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

IEEE Trans Pattern Anal Mach Intell. 2015 Sep;37(9):1904-16. doi: 10.1109/TPAMI.2015.2389824.

Visualization Methods for Image Transformation Convolutional Neural Networks.

IEEE Trans Neural Netw Learn Syst. 2019 Jul;30(7):2231-2243. doi: 10.1109/TNNLS.2018.2881194. Epub 2018 Dec 11.

An Improved Convolutional Neural Network Algorithm and Its Application in Multilabel Image Labeling.

Comput Intell Neurosci. 2019 Jul 4;2019:2060796. doi: 10.1155/2019/2060796. eCollection 2019.

Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning?

IEEE Trans Med Imaging. 2016 May;35(5):1299-1312. doi: 10.1109/TMI.2016.2535302. Epub 2016 Mar 7.

Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization.

IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2582-2597. doi: 10.1109/TPAMI.2020.2974833. Epub 2021 Jul 1.

Trunk-Branch Ensemble Convolutional Neural Networks for Video-Based Face Recognition.

IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):1002-1014. doi: 10.1109/TPAMI.2017.2700390. Epub 2017 May 2.

Self-organized operational neural networks for severe image restoration problems.

Neural Netw. 2021 Mar;135:201-211. doi: 10.1016/j.neunet.2020.12.014. Epub 2020 Dec 23.

Multiple Discrimination and Pairwise CNN for view-based 3D object retrieval.

Neural Netw. 2020 May;125:290-302. doi: 10.1016/j.neunet.2020.02.017. Epub 2020 Feb 29.

Cited by

Unified interest point detection and description for perspective and Fisheye images.

Sci Rep. 2025 Jul 1;15(1):21458. doi: 10.1038/s41598-025-02487-w.

Multi-Domain Indoor Dataset for Visual Place Recognition and Anomaly Detection by Mobile Robots.

Sci Data. 2025 May 19;12(1):817. doi: 10.1038/s41597-025-05124-3.

Impact of fine-tuning parameters of convolutional neural network for skin cancer detection.

Sci Rep. 2025 Apr 28;15(1):14779. doi: 10.1038/s41598-025-99529-0.

Content-Based Histopathological Image Retrieval.

Sensors (Basel). 2025 Feb 22;25(5):1350. doi: 10.3390/s25051350.

Multi-grained pooling network for age estimation in degraded low-resolution images.

Sci Rep. 2025 Mar 7;15(1):8030. doi: 10.1038/s41598-025-91845-9.

LoCS-Net: Localizing convolutional spiking neural network for fast visual place recognition.

Front Neurorobot. 2025 Jan 29;18:1490267. doi: 10.3389/fnbot.2024.1490267. eCollection 2024.

A deep learning method based on multi-scale fusion for noise-resistant coal-gangue recognition.

Sci Rep. 2025 Jan 2;15(1):101. doi: 10.1038/s41598-024-83604-z.

DINO-Mix enhancing visual place recognition with foundational vision model and feature mixing.

Sci Rep. 2024 Sep 27;14(1):22100. doi: 10.1038/s41598-024-73853-3.

GT-Net: global transformer network for multiclass brain tumor classification using MR images.

Biomed Eng Lett. 2024 May 31;14(5):1069-1077. doi: 10.1007/s13534-024-00393-0. eCollection 2024 Sep.

Cross-Modality Person Re-Identification Method with Joint-Modality Generation and Feature Enhancement.

Entropy (Basel). 2024 Aug 13;26(8):681. doi: 10.3390/e26080681.
