Junhua Mao, Jun Zhu, Alan L. Yuille
Comput Vis ECCV. 2014 Sep 6;8691:140-155. doi: 10.1007/978-3-319-10578-9_10.
This paper addresses the task of natural texture and appearance classification. Our goal is to develop a simple and intuitive method that achieves state-of-the-art performance on datasets ranging from homogeneous texture (e.g., material texture), to less homogeneous texture (e.g., the fur of animals), to inhomogeneous texture (the appearance patterns of vehicles). Our method uses a bag-of-words model whose features are based on a dictionary of active patches. Active patches are raw intensity patches that can undergo spatial transformations (e.g., rotation and scaling) and adjust themselves to best match the image regions. The dictionary of active patches is required to be compact and representative, in the sense that we can use it to approximately reconstruct the images that we want to classify. We propose a probabilistic model to quantify the quality of image reconstruction and design a greedy learning algorithm to obtain the dictionary. We classify images using the occurrence frequency of the active patches. Feature extraction is fast (about 100 ms per image) on a GPU. The experimental results show that our method improves the state of the art on a challenging material texture benchmark dataset (KTH-TIPS2). To test our method on less homogeneous or inhomogeneous images, we construct two new datasets consisting of appearance image patches of animals and vehicles cropped from the PASCAL VOC dataset. Our method outperforms competing methods on these datasets.
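The bag-of-words pipeline the abstract describes can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: the patch size, stride, and restriction of the spatial transformations to 90-degree rotations are assumptions for brevity (the paper's active patches allow general rotation and scaling, and matching runs on the GPU), and the dictionary here is given rather than learned by the paper's greedy algorithm.

```python
import numpy as np

def transform_variants(patch):
    """Generate spatial variants of a raw intensity patch.

    Only 90-degree rotations are used here; the paper's active
    patches also allow finer rotations and scaling.
    """
    return [np.rot90(patch, k) for k in range(4)]

def active_patch_histogram(image, dictionary, patch_size=8, stride=8):
    """Bag-of-words feature: assign each image region to the
    dictionary patch whose best-matching variant has the lowest
    sum-of-squared-differences, then return occurrence frequencies.
    """
    counts = np.zeros(len(dictionary))
    h, w = image.shape
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            region = image[y:y + patch_size, x:x + patch_size]
            best_idx, best_err = 0, np.inf
            for i, p in enumerate(dictionary):
                # Each active patch "adjusts itself" by trying its
                # transformed variants against the region.
                for v in transform_variants(p):
                    err = np.sum((region - v) ** 2)
                    if err < best_err:
                        best_err, best_idx = err, i
            counts[best_idx] += 1
    return counts / counts.sum()  # occurrence frequencies
```

The resulting frequency vector is the image descriptor; any standard classifier (e.g., an SVM) can then be trained on these histograms.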