Department of Industrial Engineering, Yonsei University, Seoul, Republic of Korea.
PLoS One. 2024 Sep 9;19(9):e0310098. doi: 10.1371/journal.pone.0310098. eCollection 2024.
Conditional image retrieval (CIR), which retrieves images using a query image together with user-specified conditions, is essential in computer vision research for efficient image search and automated image analysis. Existing approaches, such as composed image retrieval (CoIR) methods, have been actively studied; however, they require either a triplet dataset or richly annotated image-text pairs, which are expensive to obtain. In this work, we demonstrate that CIR at the level of image concepts can be achieved with an inverse mapping approach that explores the model's inductive knowledge. Our proposed CIR method, called Backward Search, updates the query embedding to conform to the condition. Specifically, the embedding of the query image is updated by predicting label probabilities and minimizing their difference from the condition label, which enables CIR with image-level concepts while preserving the context of the query. The Backward Search method supports both single- and multi-conditional image retrieval. Moreover, we reduce computation time through knowledge distillation. We conduct experiments on the WikiArt, aPY, and CUB benchmark datasets. The proposed method achieves an average mAP@10 of 0.541 across these datasets, a marked improvement over the CoIR methods in our comparative experiments. Furthermore, with the Backward Search model as the teacher in knowledge distillation, the student model runs up to 160 times faster with only a slight decrease in performance. The implementation of our method is available at the following URL: https://github.com/dhlee-work/BackwardSearch.
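The following is a minimal sketch, not the authors' implementation, of the inverse-mapping idea described in the abstract: a query embedding is iteratively updated so that a frozen label predictor assigns high probability to the user-specified condition label, while a proximity term keeps the embedding close to the original query to preserve its context, after which retrieval proceeds by nearest-neighbor search. All names, dimensions, and hyperparameters here (label_head, embed_dim, alpha, etc.) are illustrative assumptions.

```python
# Hedged sketch of a backward-search-style update over embeddings (assumed details).
import torch
import torch.nn.functional as F

embed_dim, num_labels = 512, 27                       # assumed embedding size and label count
label_head = torch.nn.Linear(embed_dim, num_labels)   # frozen classifier over embeddings
label_head.requires_grad_(False)

def backward_search(query_emb, condition_label, steps=100, lr=0.05, alpha=0.1):
    """Update the query embedding toward the condition label (illustrative only)."""
    z = query_emb.clone().detach().requires_grad_(True)
    target = torch.tensor([condition_label])
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = label_head(z)                          # predict label probabilities
        loss = F.cross_entropy(logits, target)          # push prediction toward the condition label
        loss = loss + alpha * F.mse_loss(z, query_emb)  # stay close to the original query context
        loss.backward()
        optimizer.step()
    return z.detach()

# Retrieval: rank gallery embeddings by cosine similarity to the updated query.
gallery = F.normalize(torch.randn(1000, embed_dim), dim=1)  # placeholder gallery embeddings
query = torch.randn(1, embed_dim)                           # placeholder query embedding
z_cond = F.normalize(backward_search(query, condition_label=3), dim=1)
top10 = (gallery @ z_cond.t()).squeeze(1).topk(10).indices
```

In this reading, the classifier head supplies the "inductive knowledge" that the abstract refers to, and the distilled student model mentioned there would amortize the iterative optimization into a single forward pass; both points are interpretations of the abstract rather than details confirmed by the paper's code.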