Gaur Utkarsh, Manjunath B S
IEEE Trans Image Process. 2019 Dec 11. doi: 10.1109/TIP.2019.2957937.
Superpixel segmentation is a fundamental computer vision technique that finds application in a multitude of high-level computer vision tasks. Most state-of-the-art superpixel segmentation methods are unsupervised in nature and thus cannot fully utilize frequently occurring texture patterns or incorporate multiscale context. In this paper, we show that superpixel segmentation can be improved by leveraging the superior modeling power of deep convolutional autoencoders in a fully unsupervised manner. We pose the superpixel segmentation problem as one of manifold learning, where pixels that belong to similar texture patterns are assigned near-identical embedding vectors. The proposed deep network is able to learn image-wide and dataset-wide feature patterns and the relationships between them. This knowledge is used to segment and group pixels in a way that is consistent with a more global definition of pattern coherence. Experiments demonstrate that the superpixels obtained from the embeddings learned by the proposed method outperform state-of-the-art superpixel segmentation methods in boundary precision and recall. Additionally, we find that semantic edges obtained from the superpixel embeddings are significantly better than those of contemporary unsupervised approaches.
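To make the idea of grouping pixels with near-identical embedding vectors concrete, the following is a minimal sketch and not the authors' method: the abstract does not specify how embeddings are turned into superpixels, and no autoencoder is involved here. It assumes the per-pixel embeddings are already available as a small grid of tuples, and it merges 4-connected neighbours whose Euclidean embedding distance falls below a threshold. The function name `group_by_embedding` and the parameter `tol` are hypothetical, chosen for illustration only.

```python
from collections import deque


def group_by_embedding(emb, tol=0.1):
    """Label 4-connected regions whose neighbouring embeddings differ by < tol.

    emb: H x W grid (list of lists) of equal-length embedding tuples.
    Returns an H x W grid of integer region labels.
    Illustrative stand-in only; not the grouping step used in the paper.
    """
    h, w = len(emb), len(emb[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue  # pixel already belongs to a region
            # Start a new region and grow it by breadth-first search.
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] == -1:
                        # Euclidean distance between neighbouring embeddings.
                        dist = sum(
                            (a - b) ** 2 for a, b in zip(emb[y][x], emb[ny][nx])
                        ) ** 0.5
                        if dist < tol:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            next_label += 1
    return labels


# Toy 2 x 4 grid: the left and right halves have clearly different embeddings.
toy = [
    [(0.0, 0.0), (0.01, 0.0), (1.0, 1.0), (1.0, 0.99)],
    [(0.0, 0.01), (0.0, 0.0), (0.99, 1.0), (1.0, 1.0)],
]
labels = group_by_embedding(toy)
```

On the toy grid the two halves end up as two separate regions, because the embedding distance across the middle boundary is far above `tol` while distances within each half are far below it. A real pipeline would also need to control region size and compactness, which this sketch ignores.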