

Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval

Authors

Shi Ge, Li Feng, Wu Lifang, Chen Yukun

Affiliation

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China.

Publication

Sensors (Basel). 2022 Apr 11;22(8):2921. doi: 10.3390/s22082921.

DOI: 10.3390/s22082921
PMID: 35458906
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9029824/
Abstract

The core of cross-modal hashing methods is to map high-dimensional features into binary hash codes, which can then efficiently use the Hamming distance metric to enhance retrieval efficiency. Recent developments emphasize the advantages of the unsupervised cross-modal hashing technique, since it relies only on the relevance information of paired data, making it more applicable to real-world applications. However, two problems, namely intra-modality correlation and inter-modality correlation, have still not been fully considered. Intra-modality correlation describes the complex overall concept of a single modality and provides semantic relevance for retrieval tasks, while inter-modality correlation refers to the relationship between different modalities. From our observation and hypothesis, the dependency relationships within a modality and between different modalities can be constructed at the object level, which can further improve cross-modal hashing retrieval accuracy. To this end, we propose an Object-level Visual-text Correlation Graph Hashing (OVCGH) approach to mine the fine-grained object-level similarity in cross-modal data while suppressing noise interference. Specifically, a novel intra-modality correlation graph is designed to learn graph-level representations of different modalities, obtaining the dependency relationships of image region to image region and tag to tag in an unsupervised manner. Then, we design a visual-text dependency building module that can capture correlated semantic information between different modalities by modeling the dependency relationship between image object regions and text tags. Extensive experiments on two widely used datasets verify the effectiveness of our proposed approach.

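The abstract's premise is that once each modality is mapped to binary hash codes, retrieval reduces to nearest-neighbor search under the Hamming distance (the number of differing bits). A minimal sketch of that retrieval step, using random codes in place of anything OVCGH would actually produce (the code sizes and names here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

# Illustrative only: random binary codes stand in for learned hash codes.
rng = np.random.default_rng(0)
n_bits = 64  # typical hash-code lengths are 16-128 bits

# Simulated codes for one text query and a database of 1000 images.
query_code = rng.integers(0, 2, size=n_bits, dtype=np.uint8)
image_codes = rng.integers(0, 2, size=(1000, n_bits), dtype=np.uint8)

# Hamming distance = count of positions where the bits differ.
distances = np.count_nonzero(image_codes != query_code, axis=1)

# Retrieve the indices of the 5 nearest images.
top5 = np.argsort(distances)[:5]
print(top5, distances[top5])
```

Because the codes are binary, this comparison is a bitwise XOR plus popcount in a packed implementation, which is what makes hashing-based retrieval fast at scale.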

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/26c9/9029824/9c462fda4197/sensors-22-02921-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/26c9/9029824/211c5f2b493c/sensors-22-02921-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/26c9/9029824/cc9a7d4c9d11/sensors-22-02921-g003.jpg

Similar Articles

1
Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval.
Sensors (Basel). 2022 Apr 11;22(8):2921. doi: 10.3390/s22082921.
2
CLIP-Based Adaptive Graph Attention Network for Large-Scale Unsupervised Multi-Modal Hashing Retrieval.
Sensors (Basel). 2023 Mar 24;23(7):3439. doi: 10.3390/s23073439.
3
Structure-aware contrastive hashing for unsupervised cross-modal retrieval.
Neural Netw. 2024 Jun;174:106211. doi: 10.1016/j.neunet.2024.106211. Epub 2024 Feb 27.
4
Triplet-Based Deep Hashing Network for Cross-Modal Retrieval.
IEEE Trans Image Process. 2018 Aug;27(8):3893-3903. doi: 10.1109/TIP.2018.2821921. Epub 2018 Apr 4.
5
Semantic Neighbor Graph Hashing for Multimodal Retrieval.
IEEE Trans Image Process. 2018 Mar;27(3):1405-1417. doi: 10.1109/TIP.2017.2776745. Epub 2017 Nov 22.
6
Multi-Relational Deep Hashing for Cross-Modal Search.
IEEE Trans Image Process. 2024;33:3009-3020. doi: 10.1109/TIP.2024.3385656. Epub 2024 Apr 25.
7
Deep Semantic Multimodal Hashing Network for Scalable Image-Text and Video-Text Retrievals.
IEEE Trans Neural Netw Learn Syst. 2023 Apr;34(4):1838-1851. doi: 10.1109/TNNLS.2020.2997020. Epub 2023 Apr 4.
8
Hierarchical semantic interaction-based deep hashing network for cross-modal retrieval.
PeerJ Comput Sci. 2021 May 25;7:e552. doi: 10.7717/peerj-cs.552. eCollection 2021.
9
MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval.
IEEE Trans Pattern Anal Mach Intell. 2021 Mar;43(3):964-981. doi: 10.1109/TPAMI.2019.2940446. Epub 2021 Feb 4.
10
Deep Unsupervised Hashing for Large-Scale Cross-Modal Retrieval Using Knowledge Distillation Model.
Comput Intell Neurosci. 2021 Jul 17;2021:5107034. doi: 10.1155/2021/5107034. eCollection 2021.

Cited By

1
Cross-Modal Contrastive Hashing Retrieval for Infrared Video and EEG.
Sensors (Basel). 2022 Nov 14;22(22):8804. doi: 10.3390/s22228804.

References

1
Triplet-Based Deep Hashing Network for Cross-Modal Retrieval.
IEEE Trans Image Process. 2018 Aug;27(8):3893-3903. doi: 10.1109/TIP.2018.2821921. Epub 2018 Apr 4.
2
Large-Scale Cross-Modality Search via Collective Matrix Factorization Hashing.
IEEE Trans Image Process. 2016 Nov;25(11):5427-5440. doi: 10.1109/TIP.2016.2607421. Epub 2016 Sep 8.
3
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.