Jing Yushi, Baluja Shumeet
Georgia Institute of Technology, Atlanta, USA.
IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1877-90. doi: 10.1109/TPAMI.2008.121.
Because of the relative ease in understanding and processing text, commercial image-search systems often rely on techniques that are largely indistinguishable from text search. Recently, academic studies have demonstrated the effectiveness of employing image-based features to provide alternative or additional signals. However, it remains uncertain whether such techniques will generalize to a large number of popular web queries, and whether the potential improvement to search quality warrants the additional computational cost. In this work, we cast the image-ranking problem into the task of identifying "authority" nodes on an inferred visual similarity graph and propose VisualRank to analyze the visual link structures among images. The images found to be "authorities" are chosen as those that answer the image queries well. To understand the performance of such an approach in a real system, we conducted a series of large-scale experiments based on the task of retrieving images for 2000 of the most popular product queries. Our experimental results show significant improvement, in terms of user satisfaction and relevancy, in comparison to the most recent Google Image Search results. Maintaining modest computational cost is vital to ensuring that this procedure can be used in practice; we describe the techniques required to make this system practical for large-scale deployment in commercial search engines.
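The idea of finding "authority" nodes on a visual similarity graph can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes a PageRank-style damped random walk over a symmetric similarity matrix. The function name `visual_rank`, the damping value 0.85, and the toy similarity scores are all illustrative assumptions, not details taken from the abstract.

```python
def visual_rank(similarity, damping=0.85, iterations=100, tol=1e-10):
    """Score images by damped power iteration over a similarity graph.

    similarity[i][j] is a non-negative visual similarity between images
    i and j (hypothetical input; how it is computed is out of scope here).
    """
    n = len(similarity)
    if n == 0:
        return []

    # Column sums are used to normalize each image's outgoing similarity,
    # so every image distributes a total weight of 1 to its neighbors.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]

    rank = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            flow = 0.0
            for j in range(n):
                if col_sums[j] > 0:
                    flow += similarity[i][j] / col_sums[j] * rank[j]
                else:
                    # An image with no similar neighbors spreads its
                    # score uniformly (assumed dangling-node handling).
                    flow += rank[j] / n
            new_rank.append(damping * flow + (1.0 - damping) / n)

        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy example: images 0, 1, and 2 resemble one another strongly,
# while image 3 is only weakly similar to image 0.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.0],
    [0.8, 0.7, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
]
scores = visual_rank(similarity)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy graph the image that is most similar to the others receives the highest score and the outlier receives the lowest, which matches the intuition that "authorities" are the images many other results resemble. A production system would need a sparse graph representation and a real similarity measure, neither of which this sketch attempts.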