IEEE Trans Image Process. 2017 Jun;26(6):2644-2655. doi: 10.1109/TIP.2017.2685340. Epub 2017 Mar 21.
Graph-based representation (GBR) has recently been proposed for describing the color and geometry of multiview video content. The graph vertices represent the color information, while the edges represent the geometry information, i.e., the disparity, by connecting corresponding pixels in two camera views. In this paper, we generalize the GBR to multiview images with complex camera configurations. Compared with the existing GBR, the proposed representation can handle not only horizontal displacements of the cameras but also forward/backward translations, rotations, etc. However, contrary to the usual disparity, which is a 2-D vector (denoting horizontal and vertical displacements), each edge in the GBR is represented by a 1-D disparity. This quantity can be seen as the disparity along an epipolar segment. To obtain a sparse (i.e., easy to code) graph structure, we propose a rate-distortion model to select the most meaningful edges. Hence, the graph is constructed with "just enough" information for rendering the given predicted view. The experiments show that the proposed GBR achieves high reconstruction quality at a coding rate lower than or equivalent to that of traditional depth-based representations.
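The idea of a 1-D disparity along an epipolar segment can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes pinhole cameras with projection x ~ K(RX + t), and it assumes the segment endpoints are the projections of the pixel's viewing ray at two hypothetical depth bounds, `z_far` and `z_near`. All function names, camera parameters, and depth values below are illustrative assumptions.

```python
import numpy as np

def project(K, R, t, X):
    """Project a 3-D world point X into a camera with x ~ K (R X + t)."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

def backproject(K, R, t, pixel, depth):
    """World point seen at `pixel` whose depth (camera z) is `depth`."""
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    X_cam = depth * ray              # camera coordinates, z == depth
    return R.T @ (X_cam - t)         # back to world coordinates

def epipolar_segment(pixel, cam1, cam2, z_near, z_far):
    """Endpoints in view 2 of the epipolar segment of a view-1 pixel.

    Assumption: the segment is bounded by the depths z_far (start) and
    z_near (end), so the disparity grows as the scene point gets closer.
    """
    start = project(*cam2, backproject(*cam1, pixel, z_far))
    end = project(*cam2, backproject(*cam1, pixel, z_near))
    return start, end

def pixel_from_disparity(start, end, d):
    """Map a scalar disparity d (pixels travelled from `start`) to 2-D."""
    direction = (end - start) / np.linalg.norm(end - start)
    return start + d * direction

def disparity_from_pixel(start, end, target):
    """Scalar position of `target` along the segment, measured from `start`."""
    direction = (end - start) / np.linalg.norm(end - start)
    return float((target - start) @ direction)

# Hypothetical two-camera setup: view 2 is rotated about the y axis and
# translated sideways and forward relative to view 1.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = 0.05
R2 = np.array([[np.cos(angle), 0.0, np.sin(angle)],
               [0.0, 1.0, 0.0],
               [-np.sin(angle), 0.0, np.cos(angle)]])
cam1 = (K, np.eye(3), np.zeros(3))
cam2 = (K, R2, np.array([-0.2, 0.0, 0.1]))

pixel = np.array([300.0, 200.0])     # pixel in view 1
true_depth = 4.0                     # assumed depth of the scene point

start, end = epipolar_segment(pixel, cam1, cam2, z_near=1.0, z_far=100.0)
target = project(*cam2, backproject(*cam1, pixel, true_depth))

d = disparity_from_pixel(start, end, target)      # one scalar per edge
recovered = pixel_from_disparity(start, end, d)   # 2-D position in view 2
```

In this sketch a single scalar `d` recovers the corresponding pixel in the second view even though the displacement is neither purely horizontal nor purely vertical. This works because every point on the viewing ray projects onto the same epipolar line. A pixel whose segment collapses to a single point (for example at the epipole) would need separate handling.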