

Textual query of personal photos facilitated by large-scale web data.

Affiliation

School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Blk N4, Singapore 639798.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2011 May;33(5):1022-36. doi: 10.1109/TPAMI.2010.142.

Abstract

The rapid popularization of digital cameras and mobile phone cameras has led to an explosive growth of personal photo collections by consumers. In this paper, we present a real-time textual query-based personal photo retrieval system by leveraging millions of Web images and their associated rich textual descriptions (captions, categories, etc.). After a user provides a textual query (e.g., “water”), our system exploits the inverted file to automatically find the positive Web images that are related to the textual query “water” as well as the negative Web images that are irrelevant to the textual query. Based on these automatically retrieved relevant and irrelevant Web images, we employ three simple but effective classification methods, k-Nearest Neighbor (kNN), decision stumps, and linear SVM, to rank personal photos. To further improve the photo retrieval performance, we propose two relevance feedback methods via cross-domain learning, which effectively utilize both the Web images and personal images. In particular, our proposed cross-domain learning methods can learn robust classifiers with only a very limited amount of labeled personal photos from the user by leveraging the prelearned linear SVM classifiers in real time. We further propose an incremental cross-domain learning method in order to significantly accelerate the relevance feedback process on large consumer photo databases. Extensive experiments on two consumer photo data sets demonstrate the effectiveness and efficiency of our system, which is also inherently not limited by any predefined lexicon.
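The first stage described in the abstract (an inverted file that selects positive and negative Web images, followed by kNN ranking of personal photos) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names, the toy captions, the two-dimensional feature vectors, the Euclidean distance, and the choice to treat every image whose caption lacks the query word as a negative.

```python
from collections import defaultdict
import math


def build_inverted_file(captions):
    """Map each word to the set of Web image ids whose caption contains it."""
    index = defaultdict(set)
    for image_id, text in captions.items():
        for word in text.lower().split():
            index[word].add(image_id)
    return index


def split_web_images(index, all_ids, query):
    """Positive: caption contains the query word. Negative: all other images (assumption)."""
    positive = set(index.get(query.lower(), set()))
    negative = set(all_ids) - positive
    return positive, negative


def knn_score(photo, features, positive, negative, k):
    """Fraction of the k nearest Web images (Euclidean distance) that are positive."""
    neighbours = [(math.dist(photo, features[i]), i in positive)
                  for i in positive | negative]
    neighbours.sort(key=lambda pair: pair[0])
    nearest = neighbours[:k]
    return sum(1 for _, is_positive in nearest if is_positive) / len(nearest)


def rank_photos(photos, features, positive, negative, k=2):
    """Return personal photo names sorted by decreasing kNN score."""
    scores = {name: knn_score(vec, features, positive, negative, k)
              for name, vec in photos.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: Web image captions and made-up visual features.
captions = {
    "w1": "blue water lake",
    "w2": "water fall river",
    "w3": "city street night",
    "w4": "desert sand dune",
}
features = {"w1": (0.9, 0.1), "w2": (0.8, 0.2), "w3": (0.1, 0.9), "w4": (0.2, 0.8)}
photos = {"beach.jpg": (0.85, 0.15), "office.jpg": (0.15, 0.85)}

index = build_inverted_file(captions)
positive, negative = split_web_images(index, captions.keys(), "water")
ranking = rank_photos(photos, features, positive, negative, k=2)
print(ranking)
```

In this toy setting the photo whose features lie closest to the "water" Web images is ranked first. The relevance feedback and cross-domain learning steps mentioned in the abstract are not covered by this sketch.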

