Large scale near-duplicate celebrity web images retrieval using visual and textual features.

Authors

Qiao Fengcai, Wang Cheng, Zhang Xin, Wang Hui

Affiliation

College of Information Systems and Management, National University of Defense Technology, Changsha 410073, China.

Publication information

ScientificWorldJournal. 2013 Sep 14;2013:795408. doi: 10.1155/2013/795408. eCollection 2013.

DOI: 10.1155/2013/795408

PMID: 24163631

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3791809/

Abstract

Near-duplicate image retrieval is a classical research problem in computer vision, with many applications such as image annotation and content-based image retrieval. On the web, near-duplication is more prevalent in queries for celebrities and historical figures, which are of particular interest to end users. Existing methods such as bag-of-visual-words (BoVW) address this problem mainly by exploiting purely visual features. To overcome this limitation, this paper proposes a novel text-based, data-driven reranking framework that utilizes textual features and is combined with state-of-the-art BoVW schemes. Under this framework, the input to the retrieval procedure is still only a query image. To verify the proposed approach, a dataset of 2 million images of 1089 different celebrities, together with their accompanying texts, is constructed. In addition, we comprehensively analyze the different categories of near duplication observed in the constructed dataset. Experimental results on this dataset show that the proposed framework achieves higher mean average precision (mAP), with an improvement of 21% on average over approaches based only on visual features, without notably prolonging the retrieval time.
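The abstract reports its results as mean average precision (mAP). As a reading aid, the following is a minimal sketch of the standard mAP definition, not the authors' evaluation code. The abstract does not describe the exact evaluation protocol, so the function names, the toy image identifiers, and the two example rankings are all hypothetical.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision for one query.

    `ranked` is the retrieved list, best match first (assumed duplicate-free).
    `relevant` is the set of ground-truth near-duplicates for the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    # Normalise by the number of relevant items, so missed items lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision over all queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy query: "a" and "b" are the true near-duplicates.
relevant = {"a", "b"}
visual_only = ["a", "x", "b", "y"]  # made-up ranking from visual features alone
reranked = ["a", "b", "x", "y"]     # made-up ranking after a reranking step

print(average_precision(visual_only, relevant))
print(average_precision(reranked, relevant))
```

In this toy case the reranked list scores higher because both relevant images sit at the top. This only illustrates how a reranking step can raise mAP; it says nothing about the size of the gain reported in the paper.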

Similar articles

Improving Web image search by bag-based reranking.

IEEE Trans Image Process. 2011 Nov;20(11):3280-90. doi: 10.1109/TIP.2011.2159227. Epub 2011 Jun 9.

Content-based retrieval of historical Ottoman documents stored as textual images.

IEEE Trans Image Process. 2004 Mar;13(3):314-25. doi: 10.1109/tip.2003.821114.

Annotating images by mining image search results.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1919-32. doi: 10.1109/TPAMI.2008.127.

Content Based Image Retrieval by Using Color Descriptor and Discrete Wavelet Transform.

J Med Syst. 2018 Jan 25;42(3):44. doi: 10.1007/s10916-017-0880-7.

A unified framework for image retrieval using keyword and visual features.

IEEE Trans Image Process. 2005 Jul;14(7):979-89. doi: 10.1109/tip.2005.847289.

Task-dependent visual-codebook compression.

IEEE Trans Image Process. 2012 Apr;21(4):2282-93. doi: 10.1109/TIP.2011.2176950. Epub 2011 Nov 22.

A discriminative kernel-based approach to rank images from text queries.

IEEE Trans Pattern Anal Mach Intell. 2008 Aug;30(8):1371-84. doi: 10.1109/TPAMI.2007.70791.

Recognition of pornographic web pages by classifying texts and images.

IEEE Trans Pattern Anal Mach Intell. 2007 Jun;29(6):1019-34. doi: 10.1109/TPAMI.2007.1133.

A memory learning framework for effective image retrieval.

IEEE Trans Image Process. 2005 Apr;14(4):511-24. doi: 10.1109/tip.2004.841205.

Cited by

flowSim: Near duplicate detection for flow cytometry data.

Cytometry A. 2023 Nov;103(11):889-901. doi: 10.1002/cyto.a.24776. Epub 2023 Aug 29.

An Overview of Image Caption Generation Methods.

Comput Intell Neurosci. 2020 Jan 9;2020:3062706. doi: 10.1155/2020/3062706. eCollection 2020.

Completed local ternary pattern for rotation invariant texture classification.

ScientificWorldJournal. 2014;2014:373254. doi: 10.1155/2014/373254. Epub 2014 Apr 7.

References

Enhanced perceptual distance functions and indexing for image replica recognition.

IEEE Trans Pattern Anal Mach Intell. 2005 Mar;27(3):379-91. doi: 10.1109/tpami.2005.54.
