Cross-modality sub-image retrieval using contrastive multimodal image representations.

Author information

Breznik Eva, Wetzer Elisabeth, Lindblad Joakim, Sladoje Nataša

Affiliations

Department of Information Technology, Uppsala University, 751 05, Uppsala, Sweden.

Department of Biomedical Engineering and Health Systems, Royal Institute of Technology, 141 52, Stockholm, Sweden.

Publication information

Sci Rep. 2024 Aug 13;14(1):18798. doi: 10.1038/s41598-024-68800-1.

DOI:10.1038/s41598-024-68800-1

PMID:39138271

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11322435/

Abstract

In tissue characterization and cancer diagnostics, multimodal imaging has emerged as a powerful technique. Thanks to computational advances, large datasets can be exploited to discover patterns in pathologies and improve diagnosis. However, this requires efficient and scalable image retrieval methods. Cross-modality image retrieval is particularly challenging, since images of similar (or even the same) content captured by different modalities might share few common structures. We propose a new application-independent content-based image retrieval (CBIR) system for reverse (sub-)image search across modalities, which combines deep learning to generate representations (embedding the different modalities in a common space) with robust feature extraction and bag-of-words models for efficient and reliable retrieval. We illustrate its advantages through a replacement study, exploring a number of feature extractors and learned representations, as well as through comparison to recent (cross-modality) CBIR methods. For the task of (sub-)image retrieval on a (publicly available) dataset of brightfield and second harmonic generation microscopy images, the results show that our approach is superior to all tested alternatives. We discuss the shortcomings of the compared methods and observe the importance of equivariance and invariance properties of the learned representations and feature extractors in the CBIR pipeline. Code is available at: https://github.com/MIDA-group/CrossModal_ImgRetrieval .
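
The bag-of-words retrieval step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it assumes that local feature descriptors have already been extracted from representations that place both modalities in a common space, and it substitutes plain k-means and cosine similarity for whatever vocabulary construction and ranking the paper actually uses. The function names `build_vocabulary`, `bow_histogram`, and `rank_images` are hypothetical, and the descriptors in the demo are synthetic.

```python
import numpy as np


def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Cluster local descriptors with plain k-means; the k centroids act as visual words."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[start].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every descriptor to every centroid.
        dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def bow_histogram(descriptors, centroids):
    """Assign each descriptor to its nearest word and return an L2-normalised word histogram."""
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def rank_images(query_hist, database_hists):
    """Return database indices sorted by decreasing cosine similarity to the query."""
    sims = database_hists @ query_hist  # histograms are already unit length
    return np.argsort(-sims)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dim, k = 8, 4
    word_centers = rng.normal(scale=10.0, size=(k, dim))  # synthetic "true" words

    def fake_descriptors(counts):
        # Synthetic stand-in for descriptors extracted from one image.
        parts = [word_centers[w] + rng.normal(scale=0.1, size=(n, dim))
                 for w, n in enumerate(counts) if n > 0]
        return np.vstack(parts)

    database = [fake_descriptors(c) for c in [(30, 5, 0, 0), (0, 30, 5, 0), (5, 0, 0, 30)]]
    vocabulary = build_vocabulary(np.vstack(database), k)
    db_hists = np.stack([bow_histogram(d, vocabulary) for d in database])

    # A query patch whose word mix resembles the first database image.
    query = fake_descriptors((12, 2, 0, 0))
    print(rank_images(bow_histogram(query, vocabulary), db_hists))
```

Representing each image as a fixed-length word histogram is what makes this kind of retrieval scale: ranking reduces to one matrix-vector product regardless of how many local descriptors each image produced.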
