IEEE Trans Image Process. 2015 Jul;24(7):2182-96. doi: 10.1109/TIP.2015.2416654. Epub 2015 Mar 25.
This paper presents a unified variational formulation for joint object segmentation and stereo matching that takes both accuracy and efficiency into account. In our approach, the depth map consists of compact objects; each object is represented through three different aspects: 1) the perimeter in image space; 2) the slanted object depth plane; and 3) the planar bias, an additional level of detail on top of each object plane that models depth variations within the object. In contrast to traditional high-quality low-level solvers, we use a convex formulation of the multilabel Potts model combined with PatchMatch stereo techniques to generate an object-level depth map for each image, and we show that accurate multiple-view reconstruction can be achieved with our formulation by means of induced homographies, without discretization or staircasing artifacts. Our model is formulated as an energy minimization that is optimized via a fast primal-dual algorithm, which can handle several hundred object depth segments efficiently. Performance evaluations on the Middlebury benchmark data sets show that our method outperforms the traditional integer-valued disparity strategy as well as the original PatchMatch algorithm and its variants in subpixel-accurate disparity estimation. The proposed algorithm is also evaluated and shown to produce consistently good results on various real-world data sets (the KITTI benchmark data sets and multiview benchmark data sets).
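The abstract's combination of a convex multilabel Potts relaxation with a fast primal-dual solver can be illustrated with a generic first-order primal-dual (Chambolle-Pock style) iteration for TV-relaxed Potts labeling. The sketch below is not the paper's full pipeline (it omits PatchMatch plane proposals and the planar-bias term); the function names, the per-pixel label costs `rho`, and all step sizes are illustrative assumptions.

```python
import numpy as np

def grad(u):
    # Forward differences with Neumann boundary conditions.
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    # Negative adjoint of grad (standard discrete divergence).
    d = np.zeros_like(px)
    d[0, :] += px[0, :]
    d[1:-1, :] += px[1:-1, :] - px[:-2, :]
    d[-1, :] -= px[-2, :]
    d[:, 0] += py[:, 0]
    d[:, 1:-1] += py[:, 1:-1] - py[:, :-2]
    d[:, -1] -= py[:, -2]
    return d

def project_simplex(u):
    # Project each pixel's label vector (axis 0) onto the probability simplex.
    L = u.shape[0]
    s = np.sort(u, axis=0)[::-1]                     # descending sort
    css = np.cumsum(s, axis=0) - 1.0
    idx = np.arange(1, L + 1).reshape(L, 1, 1)
    k = (s - css / idx > 0).sum(axis=0)              # active set size per pixel
    t = np.take_along_axis(css, (k - 1)[None], axis=0) / k
    return np.maximum(u - t, 0.0)

def potts_primal_dual(rho, lam=1.0, n_iter=300, tau=0.25, sigma=0.5):
    """Primal-dual iterations for the convex (TV) relaxation of the
    multilabel Potts model:
        min_u  sum_l <u_l, rho_l> + lam * TV(u_l),  u(x) in the simplex,
    where rho has shape (L, H, W): per-pixel cost of assigning each label.
    tau * sigma <= 1/8 satisfies the step-size condition for the gradient."""
    L = rho.shape[0]
    u = np.full_like(rho, 1.0 / L)
    ubar = u.copy()
    px = np.zeros_like(rho)
    py = np.zeros_like(rho)
    for _ in range(n_iter):
        # Dual ascent + projection onto {|p| <= lam} (TV regularizer).
        for l in range(L):
            gx, gy = grad(ubar[l])
            px[l] += sigma * gx
            py[l] += sigma * gy
            norm = np.maximum(1.0, np.sqrt(px[l] ** 2 + py[l] ** 2) / lam)
            px[l] /= norm
            py[l] /= norm
        # Primal descent + simplex projection, then over-relaxation.
        u_old = u.copy()
        for l in range(L):
            u[l] = u[l] + tau * (div(px[l], py[l]) - rho[l])
        u = project_simplex(u)
        ubar = 2.0 * u - u_old
    return u.argmax(axis=0)   # hard labeling from the relaxed solution
```

For example, with two labels whose unary costs favor the left and right image halves respectively, the iteration recovers a clean two-segment labeling with a regularized boundary.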