Yang Hua, Huang Chenghui, Wang Feiyue, Song Kaiyou, Yin Zhouping
IEEE Trans Image Process. 2019 Jan 17. doi: 10.1109/TIP.2019.2893743.
Almost all conventional template-matching methods employ low-level image features, such as pixel intensity and pixel gradient, to measure the similarity between a template image and a scene image. Although these methods have been widely used in many applications, they cannot simultaneously address all types of robustness challenges. In this study, with the goal of simultaneously addressing these various challenges, we present a robust semantic template-matching approach (RSTM). Inspired by the local binary descriptor, we propose a novel superpixel region binary descriptor (SRBD) to construct a multilevel semantic fusion feature vector for RSTM. SRBD uses a new kernel-distance-based simple linear iterative clustering (KD-SLIC) method to extract stable superpixels from the template image; then, based on the average intensity difference between each superpixel region and its neighbors, the dominant gradient orientation of each superpixel is obtained, and the semantic features of each superpixel are described by a dominant-orientation difference vector, which is coded as the rotation-invariant SRBD. In the offline matching phase, the fusion semantic feature vector of RSTM combines multilevel SRBD features computed with different numbers of superpixels. In the online matching phase, to achieve rotation invariance, a marginal probability model is proposed and applied to locate the positions of template images in the scene image. Moreover, an image pyramid is employed to accelerate computation. We conduct a series of experiments on a large dataset randomly selected from the MS COCO dataset to fully analyze the robustness of this approach. The experimental results show that RSTM simultaneously handles rotation changes, scale changes, noise, occlusion, blur, nonlinear illumination changes, and deformation with high time efficiency while outperforming previous state-of-the-art template-matching methods.
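The core SRBD idea in the abstract, finding a dominant gradient orientation from the average intensity differences between a superpixel and its neighbors, then encoding the differences relative to that orientation so the binary code is rotation-invariant, can be sketched in miniature. The following is an illustrative toy sketch only, not the paper's exact formulation: the neighbor representation, the mean-difference threshold, and the function names are all assumptions introduced for illustration.

```python
import math

# Toy sketch of an LBP-style, rotation-invariant region code (illustrative
# assumption, not the paper's exact SRBD). A superpixel is summarized by its
# mean intensity; each neighbor is a (angle_in_radians, mean_intensity) pair.

def dominant_orientation(center_mean, neighbors):
    """Return the neighbor angle with the largest absolute intensity
    difference from the center region (a stand-in for the dominant
    gradient orientation)."""
    best_angle, best_diff = None, -1.0
    for angle, mean in neighbors:
        diff = abs(mean - center_mean)
        if diff > best_diff:
            best_angle, best_diff = angle, diff
    return best_angle

def binary_code(center_mean, neighbors):
    """Order neighbors starting from the dominant orientation, then
    threshold each intensity difference against the mean absolute
    difference. Starting from the dominant orientation makes the code
    invariant to a global rotation of the neighborhood."""
    dom = dominant_orientation(center_mean, neighbors)
    ordered = sorted(neighbors, key=lambda nm: (nm[0] - dom) % (2 * math.pi))
    diffs = [mean - center_mean for _, mean in ordered]
    thresh = sum(abs(d) for d in diffs) / len(diffs)
    return "".join("1" if abs(d) >= thresh else "0" for d in diffs)

# Four neighbors at 0, 90, 180, 270 degrees around a region of mean 100.
neighbors = [(0.0, 120.0), (math.pi / 2, 95.0),
             (math.pi, 80.0), (3 * math.pi / 2, 100.0)]
code = binary_code(100.0, neighbors)  # "1010"

# Rotating the whole neighborhood by 90 degrees leaves the code unchanged.
rotated = [((a + math.pi / 2) % (2 * math.pi), m) for a, m in neighbors]
rotated_code = binary_code(100.0, rotated)  # also "1010"
```

Because the code starts at the dominant orientation, only the cyclic order of neighbors matters, which is what makes a descriptor of this family robust to in-plane rotation.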