School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore.
IEEE Trans Image Process. 2012 Apr;21(4):2207-19. doi: 10.1109/TIP.2011.2181952. Epub 2011 Dec 26.
Given a collection of images or a short video sequence, we define a thematic object as the key object that appears frequently and is representative of the visual content. Successfully discovering the thematic object helps with object search and tagging, video summarization and understanding, etc. However, this task is challenging because 1) no a priori knowledge of the thematic objects is available, such as their shapes, scales, locations, and times of recurrence, and 2) the thematic object of interest can undergo severe variations in appearance due to changes in viewpoint and lighting conditions, scale variations, etc. Instead of using a top-down generative model to discover thematic visual patterns, we propose a novel bottom-up approach that gradually prunes uncommon local visual primitives and recovers the thematic objects. A multilayer candidate pruning procedure is designed to accelerate the image data mining process. Our solution can efficiently locate thematic objects of various sizes and can tolerate large appearance variations of the same thematic object. Experiments on challenging image and video data sets and comparisons with existing methods validate the effectiveness of our method.
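The abstract does not spell out the pruning procedure, so the following is only a toy sketch of the general idea of gradually pruning uncommon primitives, not the authors' algorithm. It assumes each image has already been reduced to a set of quantized primitive labels; the function name `prune_uncommon_primitives` and the thresholds `min_support` and `min_kept` are hypothetical and introduced purely for illustration.

```python
from collections import Counter


def prune_uncommon_primitives(images, min_support=2, min_kept=2):
    """Iteratively drop rare primitives until nothing changes.

    images: dict mapping an image id to a set of primitive labels
            (assumed input format, e.g. quantized local descriptors).
    min_support: a label must occur in at least this many images to survive.
    min_kept: an image must retain at least this many labels to survive.
    """
    kept = {img: set(labels) for img, labels in images.items()}
    while True:
        # Count in how many surviving images each label still occurs.
        support = Counter(label for labels in kept.values() for label in labels)
        pruned = {}
        for img, labels in kept.items():
            common = {label for label in labels if support[label] >= min_support}
            # Images with too few common labels are dropped, which can
            # lower the support of other labels in the next round.
            if len(common) >= min_kept:
                pruned[img] = common
        if pruned == kept:  # fixed point: no label or image was removed
            return kept
        kept = pruned


images = {"a": {1, 2, 3}, "b": {1, 2, 4}, "c": {3, 5}, "d": {1, 2, 6}}
result = prune_uncommon_primitives(images)
print(result)
```

In this toy input, label 3 survives the first round because it occurs in two images, but it is pruned in the second round once image "c" has been discarded. That illustrates why the pruning has to be repeated rather than done in a single pass.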