IEEE Trans Image Process. 2014 Jul;23(7):3138-51. doi: 10.1109/TIP.2014.2326413.
Transmitting a compact representation of the geometry of a dynamic 3D scene from a sender enables a range of imaging functionalities at a receiver, such as synthesis of virtual images at freely chosen viewpoints via depth-image-based rendering. Depth maps (projections of 3D geometry onto 2D image planes at chosen camera viewpoints) can now be readily captured by inexpensive depth sensors, but they are often corrupted by non-negligible acquisition noise. Because depth maps must be both denoised and compressed at the encoder for efficient network transmission to the decoder, in this paper we consider the denoising and compression problems jointly, arguing that doing so yields better overall performance than solving the two problems separately in two stages. Specifically, we formulate a rate-constrained estimation problem: given a set of observed noise-corrupted depth maps, the most probable (maximum a posteriori, MAP) 3D surface is sought within a search space of surfaces whose representation size is no larger than a prespecified rate constraint. Our rate-constrained MAP solution reduces to the conventional unconstrained MAP 3D surface reconstruction solution if the rate constraint is loose. To solve the posed rate-constrained estimation problem, we propose an iterative algorithm in which each iteration alternately optimizes the structure (object boundaries) and the texture (surfaces within the object boundaries) of the depth maps. With the MVC codec used to compress multiview depth video, and MPEG free-viewpoint video sequences as input, experimental results show that the rate-constrained estimated 3D surfaces computed by our algorithm can reduce the coding rate of depth maps by up to 32% compared with unconstrained estimated surfaces, for the same quality of synthesized virtual views at the decoder.
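The rate-constrained MAP problem described in the abstract can be written schematically as follows. This is only an illustrative sketch: the symbols $\mathbf{y}$ (observed noisy depth maps), $\mathbf{x}$ (candidate 3D surface), $R(\cdot)$ (representation size in bits), $R_{\max}$ (rate budget), and $\lambda$ (Lagrange multiplier) are notation assumed here and are not necessarily the notation or exact objective used in the paper.

```latex
% Constrained form: most probable surface among those that fit the rate budget
\hat{\mathbf{x}} \;=\; \arg\max_{\mathbf{x}} \; P(\mathbf{x} \mid \mathbf{y})
\quad \text{subject to} \quad R(\mathbf{x}) \le R_{\max}

% A common unconstrained (Lagrangian) relaxation, using
% P(x | y) \propto P(y | x) P(x) and taking negative logarithms
\hat{\mathbf{x}} \;=\; \arg\min_{\mathbf{x}} \;
  -\log P(\mathbf{y} \mid \mathbf{x}) \;-\; \log P(\mathbf{x}) \;+\; \lambda \, R(\mathbf{x}),
\qquad \lambda \ge 0
```

In this sketch, a loose rate budget leaves the constraint inactive (equivalently, $\lambda = 0$), and the problem becomes the conventional unconstrained MAP estimate, which matches the reduction stated in the abstract.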