IEEE Trans Image Process. 2017 Dec;26(12):5908-5921. doi: 10.1109/TIP.2017.2745106. Epub 2017 Aug 25.
We study the problem of fine-grained sketch-based image retrieval. By performing instance-level (rather than category-level) retrieval, it embodies a timely and practical application, particularly given the ubiquity of touchscreens. Three factors make the problem challenging: 1) free-hand sketches are inherently abstract and iconic, making visual comparison with photos difficult; 2) sketches and photos lie in two different visual domains, i.e., black-and-white lines versus color pixels; and 3) fine-grained distinctions are especially hard to draw across domains and abstraction levels. To address these challenges, we propose to bridge the image-sketch gap both at the high level, via parts and attributes, and at the low level, via a new domain alignment method. More specifically, we first contribute a data set of 304 photos and 912 sketches, in which each sketch and image is annotated with its semantic parts and associated part-level attributes. Second, using this data set, we investigate how strongly supervised deformable part-based models can be learned that enable automatic detection of part-level attributes and provide pose-aligned sketch-image comparisons. Third, to reduce the sketch-image gap when comparing low-level features, we propose a novel method for instance-level domain alignment that exploits both subspace and instance-level cues to better align the domains. Fourth, these components are combined in a matching framework that integrates aligned low-level features, mid-level geometric structure, and high-level semantic attributes. Extensive experiments on our new data set demonstrate the effectiveness of the proposed method.
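The abstract does not spell out the domain alignment step. As an illustration of the subspace idea only, the following is a minimal sketch of plain subspace alignment, a generic technique that aligns the PCA basis of one domain to that of the other. It is not the proposed method, which additionally exploits instance-level cues. The function names, the subspace dimension `d`, and the use of Euclidean distance for ranking are all assumptions made for this example.

```python
import numpy as np


def pca_basis(X, d):
    """Top-d principal directions of X (rows are samples), as a (D, d) matrix."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T


def subspace_align(sketch_feats, photo_feats, d):
    """Generic subspace alignment (an assumed stand-in, not the paper's method).

    Projects sketch features into their own PCA subspace, then maps them
    into the photo subspace with the alignment matrix M = Ps^T Pt.
    """
    Ps = pca_basis(sketch_feats, d)   # sketch (source) basis, shape (D, d)
    Pt = pca_basis(photo_feats, d)    # photo (target) basis, shape (D, d)
    M = Ps.T @ Pt                     # (d, d) alignment matrix
    sketch_aligned = (sketch_feats - sketch_feats.mean(axis=0)) @ Ps @ M
    photo_proj = (photo_feats - photo_feats.mean(axis=0)) @ Pt
    return sketch_aligned, photo_proj


def rank_photos(sketch_aligned, photo_proj):
    """For each sketch, photo indices sorted by increasing Euclidean distance."""
    diff = sketch_aligned[:, None, :] - photo_proj[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    return np.argsort(dist, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random placeholder features; real features would come from a descriptor.
    sketches = rng.normal(size=(12, 8))
    photos = rng.normal(size=(6, 8))
    s, p = subspace_align(sketches, photos, d=3)
    print(rank_photos(s, p).shape)
```

In this sketch both domains end up in the same `d`-dimensional space, so any distance-based retrieval can be applied afterwards. How the instance-level cues enter the alignment is left open here because the abstract does not describe it.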