Institute of Software Technology and Interactive Systems, Vienna University of Technology, Interactive Media Systems Group, Favoritenstrasse 9-11/188/2, A-1040 Vienna, Austria.
IEEE Trans Pattern Anal Mach Intell. 2013 Feb;35(2):504-11. doi: 10.1109/TPAMI.2012.156.
Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper, we propose a generic and simple framework comprising three steps: 1) constructing a cost volume, 2) fast cost volume filtering, and 3) Winner-Takes-All label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve 1) disparity maps in real time whose quality exceeds that of all other fast (local) approaches on the Middlebury stereo benchmark, and 2) optical flow fields which contain very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to apply this framework to other application areas.
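The three steps named in the abstract (cost volume construction, cost volume filtering, Winner-Takes-All selection) can be illustrated with a minimal sketch for the stereo case. Everything below is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the matching cost is a plain absolute intensity difference, and a brute-force joint bilateral filter stands in for the "very fast edge-preserving filter", which the abstract does not name. The sketch uses only the Python standard library and is meant for tiny grayscale images.

```python
import math


def build_cost_volume(left, right, num_labels):
    """Step 1: cost[d][y][x] = |left[y][x] - right[y][x - d]| for each label d.

    Out-of-range columns are clamped to column 0 (an illustrative choice).
    """
    h, w = len(left), len(left[0])
    return [[[abs(left[y][x] - right[y][max(x - d, 0)]) for x in range(w)]
             for y in range(h)]
            for d in range(num_labels)]


def filter_slice(cost, guide, radius, sigma):
    """Step 2: smooth one cost slice with an edge-preserving weighted average.

    Weights depend on intensity differences in the guide image, so costs are
    averaged mostly among pixels that look alike (joint bilateral stand-in).
    """
    h, w = len(cost), len(cost[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = norm = 0.0
            for qy in range(max(y - radius, 0), min(y + radius + 1, h)):
                for qx in range(max(x - radius, 0), min(x + radius + 1, w)):
                    diff = guide[y][x] - guide[qy][qx]
                    weight = math.exp(-diff * diff / (2.0 * sigma * sigma))
                    total += weight * cost[qy][qx]
                    norm += weight
            out[y][x] = total / norm  # norm >= 1: the center pixel has weight 1
    return out


def winner_takes_all(volume):
    """Step 3: per pixel, pick the label with the lowest filtered cost."""
    h, w = len(volume[0]), len(volume[0][0])
    labels = range(len(volume))
    return [[min(labels, key=lambda d: volume[d][y][x]) for x in range(w)]
            for y in range(h)]


# Synthetic pair: the left image is the right image shifted by two pixels.
right = [[10.0 * x + 3.0 * y for x in range(10)] for y in range(4)]
left = [[right[y][max(x - 2, 0)] for x in range(10)] for y in range(4)]

volume = build_cost_volume(left, right, num_labels=4)
filtered = [filter_slice(cost, left, radius=2, sigma=10.0) for cost in volume]
disparity = winner_takes_all(filtered)
```

On this synthetic pair the selected label should be 2 at every pixel, matching the shift used to build `left`. The quadruple loop is far too slow for real images; it only shows where an edge-preserving filter plugs into the pipeline, and the same three functions would apply to other labeling problems by swapping the cost in step 1.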