Learning to Embed Semantic Similarity for Joint Image-Text Retrieval.

Authors

Malali Noam, Keller Yosi

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):10252-10260. doi: 10.1109/TPAMI.2021.3132163. Epub 2022 Nov 7.

DOI:10.1109/TPAMI.2021.3132163

Abstract

We present a deep learning approach for learning the joint semantic embeddings of images and captions in a Euclidean space, such that the semantic similarity is approximated by the L2 distances in the embedding space. For that, we introduce a metric learning scheme that utilizes multitask learning to learn the embedding of identical semantic concepts using a center loss. By introducing a differentiable quantization scheme into the end-to-end trainable network, we derive a semantic embedding of semantically similar concepts in Euclidean space. We also propose a novel metric learning formulation using an adaptive margin hinge loss that is refined during the training phase. The proposed scheme was applied to the MS-COCO, Flickr30K and Flickr8K datasets, and was shown to compare favorably with contemporary state-of-the-art approaches.
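The two loss terms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the adaptive-margin rule, the quantization scheme, or the network, so the sketch assumes a fixed margin and precomputed embeddings, and the function names (`l2_distances`, `hinge_ranking_loss`, `center_loss`) are hypothetical.

```python
import numpy as np

def l2_distances(a, b):
    """Pairwise Euclidean (L2) distances between rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def hinge_ranking_loss(img, txt, margin=0.2):
    """Fixed-margin hinge loss (assumption: the paper adapts the margin during training).

    Row i of `img` is assumed to match row i of `txt`.
    """
    d = l2_distances(img, txt)             # (n, n) image-to-caption distances
    pos = np.diag(d)                       # distances of the matching pairs
    # Penalise non-matching captions that are not at least `margin` farther
    # from an image than its matching caption, and the symmetric case.
    cost_txt = np.maximum(0.0, margin + pos[:, None] - d)
    cost_img = np.maximum(0.0, margin + pos[None, :] - d)
    off_diag = ~np.eye(len(d), dtype=bool)  # ignore the matching pairs themselves
    return (cost_txt[off_diag].sum() + cost_img[off_diag].sum()) / len(d)

def center_loss(emb, labels, centers):
    """Mean squared L2 distance between each embedding and its concept center."""
    return ((emb - centers[labels]) ** 2).sum(axis=1).mean()
```

In this sketch, well-separated matching pairs give a hinge loss of zero, while the center term pulls embeddings that share a concept label toward a common point, which is the general role a center loss plays in metric learning.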

Similar articles

Deep Relation Embedding for Cross-Modal Retrieval.

IEEE Trans Image Process. 2021;30:617-627. doi: 10.1109/TIP.2020.3038354. Epub 2020 Dec 1.

Image-Text Embedding Learning via Visual and Textual Semantic Reasoning.

IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):641-656. doi: 10.1109/TPAMI.2022.3148470. Epub 2022 Dec 5.

Introspective Deep Metric Learning.

IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):1964-1980. doi: 10.1109/TPAMI.2023.3312311. Epub 2024 Mar 6.

Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly.

IEEE Trans Pattern Anal Mach Intell. 2020 Feb;42(2):276-290. doi: 10.1109/TPAMI.2018.2848925. Epub 2018 Jun 25.

Zero-Shot Image Classification Based on a Learnable Deep Metric.

Sensors (Basel). 2021 May 7;21(9):3241. doi: 10.3390/s21093241.

Learning Relationship-Enhanced Semantic Graph for Fine-Grained Image-Text Matching.

IEEE Trans Cybern. 2024 Feb;54(2):948-961. doi: 10.1109/TCYB.2022.3179020. Epub 2024 Jan 17.

Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features.

IEEE Trans Image Process. 2018;27(1):106-120. doi: 10.1109/TIP.2017.2755766.

A boosting framework for visuality-preserving distance metric learning and its application to medical image retrieval.

IEEE Trans Pattern Anal Mach Intell. 2010 Jan;32(1):30-44. doi: 10.1109/TPAMI.2008.273.

Topic-Oriented Image Captioning Based on Order-Embedding.

IEEE Trans Image Process. 2019 Jun;28(6):2743-2754. doi: 10.1109/TIP.2018.2889922. Epub 2018 Dec 27.
