Composed Image Retrieval via Explicit Erasure and Replenishment With Semantic Alignment.

Authors

Zhang Gangjian, Wei Shikui, Pang Huaxin, Qiu Shuang, Zhao Yao

Publication information

IEEE Trans Image Process. 2022;31:5976-5988. doi: 10.1109/TIP.2022.3204213. Epub 2022 Sep 15.

DOI: 10.1109/TIP.2022.3204213
PMID: 36094980
Abstract

Composed image retrieval aims to retrieve the desired images given a reference image and a piece of text. Handling this task requires modeling two subprocesses: erasing the details of the reference image that are irrelevant with respect to the text, and replenishing the image with the details the text asks for. Existing methods do not distinguish between these two subprocesses and instead handle them implicitly and jointly. To model the two subprocesses explicitly and in order, we propose a novel composed image retrieval method with three key components: a Multi-semantic Dynamic Suppression module (MDS), a Text-semantic Complementary Selection module (TCS), and Semantic Space Alignment constraints (SSA). Concretely, MDS erases irrelevant details of the reference image by suppressing its semantic features. TCS selects and enhances the semantic features of the text and then replenishes the reference image with them. Finally, to facilitate the erasure and replenishment subprocesses, SSA aligns the semantics of the two modalities' features in the final space. Extensive experiments on three benchmark datasets (Shoes, FashionIQ, and Fashion200K) show that our approach outperforms state-of-the-art methods.
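
The erase-then-replenish ordering described in the abstract can be illustrated with a toy sketch. This is not the authors' MDS, TCS, or SSA implementation: the function names, the sigmoid gates, the hand-set gate logits, and the three-dimensional features (red, blue, sneaker) are all assumptions chosen for illustration. In a trained model the gates would be learned from the text, and the alignment constraint is omitted here entirely.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def compose(image_feat, text_feat, erase_logits, select_logits):
    """Hypothetical two-step composition: erase first, then replenish."""
    # Step 1 (erasure): suppress image dimensions flagged as irrelevant.
    erased = [(1.0 - sigmoid(e)) * v for v, e in zip(image_feat, erase_logits)]
    # Step 2 (replenishment): add the text dimensions the selection gate keeps.
    return [x + sigmoid(s) * t
            for x, s, t in zip(erased, select_logits, text_feat)]

def rank(query, gallery):
    """Return gallery keys sorted by descending cosine similarity to the query."""
    return sorted(gallery, key=lambda k: cosine(query, gallery[k]), reverse=True)

# Toy feature axes (assumed): [red, blue, sneaker].
reference_image = [1.0, 0.0, 1.0]    # a red sneaker
text = [0.0, 1.0, 0.0]               # "make it blue"
erase_logits = [10.0, -10.0, -10.0]  # suppress "red", keep the rest
select_logits = [-10.0, 10.0, -10.0] # take "blue" from the text

composed = compose(reference_image, text, erase_logits, select_logits)
gallery = {
    "red_sneaker": [1.0, 0.0, 1.0],
    "blue_sneaker": [0.0, 1.0, 1.0],
    "blue_boot": [0.0, 1.0, 0.0],
}
ranking = rank(composed, gallery)
print(ranking)
```

With these hand-set gates, the "red" component of the reference is suppressed before "blue" is added, so the composed query sits closest to the blue sneaker while the retained "sneaker" component keeps it away from the blue boot.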

Similar articles

1. Composed Image Retrieval via Explicit Erasure and Replenishment With Semantic Alignment.
   IEEE Trans Image Process. 2022;31:5976-5988. doi: 10.1109/TIP.2022.3204213. Epub 2022 Sep 15.
2. Geometry Sensitive Cross-Modal Reasoning for Composed Query Based Image Retrieval.
   IEEE Trans Image Process. 2022;31:1000-1011. doi: 10.1109/TIP.2021.3138302. Epub 2022 Jan 10.
3. HAAN: Learning a Hierarchical Adaptive Alignment Network for Image-Text Retrieval.
   Sensors (Basel). 2023 Feb 25;23(5):2559. doi: 10.3390/s23052559.
4. Hierarchical matching and reasoning for multi-query image retrieval.
   Neural Netw. 2024 May;173:106200. doi: 10.1016/j.neunet.2024.106200. Epub 2024 Feb 22.
5. Progressive Cross-Modal Semantic Network for Zero-Shot Sketch-Based Image Retrieval.
   IEEE Trans Image Process. 2020 Sep 10;PP. doi: 10.1109/TIP.2020.3020383.
6. Self-Training Boosted Multi-Factor Matching Network for Composed Image Retrieval.
   IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):3665-3678. doi: 10.1109/TPAMI.2023.3346434. Epub 2024 Apr 3.
7. Memorize, Associate and Match: Embedding Enhancement via Fine-Grained Alignment for Image-Text Retrieval.
   IEEE Trans Image Process. 2021;30:9193-9207. doi: 10.1109/TIP.2021.3123553. Epub 2021 Nov 10.
8. Image-Specific Information Suppression and Implicit Local Alignment for Text-Based Person Search.
   IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17973-17986. doi: 10.1109/TNNLS.2023.3310118. Epub 2024 Dec 2.
9. A hierarchical knowledge-based approach for retrieving similar medical images described with semantic annotations.
   J Biomed Inform. 2014 Jun;49:227-44. doi: 10.1016/j.jbi.2014.02.018. Epub 2014 Mar 12.
10. Relation-Aggregated Cross-Graph Correlation Learning for Fine-Grained Image-Text Retrieval.
    IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2194-2207. doi: 10.1109/TNNLS.2022.3188569. Epub 2024 Feb 5.