IEEE Trans Image Process. 2018 Mar;27(3):1190-1201. doi: 10.1109/TIP.2017.2772858.
High-quality virtual views need to be synthesized from adjacent available views for free viewpoint video and multiview video coding (MVC) to provide users with a more realistic 3D viewing experience of a scene. View synthesis techniques suffer from poor rendering quality due to holes created by occlusion and by integer rounding errors introduced during warping. To remove the holes in the virtual view, existing techniques exploit spatial and temporal correlation in intra-/inter-view images and depth maps. However, they still suffer quality degradation in the boundary region between foreground and background areas, due to low spatial correlation in texture images and low correspondence in inter-view depth maps. To overcome these limitations, our proposed technique uses multiple components of a Gaussian mixture model (GMM) to separate background and foreground pixels. The missing pixels introduced by the warping process are recovered by an adaptive weighted average of the pixel intensities from the corresponding GMM component(s) and the warped image. The weights vary over time to accommodate changes caused by a dynamic background and by the motion of moving objects. We also introduce an adaptive strategy that resets the GMM modeling if the contribution of its pixel intensities drops significantly. Our experimental results indicate that the proposed approach provides a 5.40-6.60 dB PSNR improvement compared with the relevant methods. To verify the effectiveness of the proposed view synthesis technique, we use the synthesized view as an extra reference frame in motion estimation for MVC. The experimental results confirm that the proposed view synthesis improves PSNR by 3.15-5.13 dB compared with the conventional three reference frames.
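The abstract's core mechanism, per-pixel GMM background modeling plus a weighted blend of the background estimate and the warped pixel, can be sketched as follows. This is a minimal illustration in the spirit of standard Stauffer-Grimson-style mixture updates, not the paper's actual algorithm: the function names, the simplified single learning rate `lr`, the fixed initial variance, and the scalar blending weight `alpha` are all assumptions for illustration; the paper's time-varying adaptive weights and GMM reset strategy are not reproduced here.

```python
import numpy as np

def update_gmm(means, variances, weights, pixel, lr=0.05, match_thresh=2.5):
    """One Stauffer-Grimson-style update of a per-pixel Gaussian mixture.

    means, variances, weights: 1-D arrays, one entry per mixture component.
    pixel: the new observed intensity at this pixel location.
    Returns the updated (means, variances, weights).
    """
    # A component "matches" if the pixel lies within match_thresh
    # standard deviations of its mean.
    dist = np.abs(means - pixel)
    matches = dist < match_thresh * np.sqrt(variances)
    if matches.any():
        k = int(np.argmax(matches))  # first matching component
        weights *= (1.0 - lr)
        weights[k] += lr
        # Simplified per-component learning rate (rho = lr).
        means[k] += lr * (pixel - means[k])
        variances[k] += lr * ((pixel - means[k]) ** 2 - variances[k])
    else:
        # No match: replace the least-probable component with a new
        # Gaussian centered on the pixel (high initial variance).
        k = int(np.argmin(weights))
        means[k], variances[k], weights[k] = pixel, 15.0 ** 2, lr
    weights /= weights.sum()  # renormalize mixture weights
    return means, variances, weights

def fill_hole(bg_mean, warped_pixel, alpha):
    """Weighted average of the GMM background estimate and the warped pixel.

    alpha is a hypothetical blending weight; in the paper this weight is
    adaptive and varies over time.
    """
    return alpha * bg_mean + (1.0 - alpha) * warped_pixel
```

In use, components with high weight and low variance would be treated as background, and a hole pixel left by warping would be filled by `fill_hole` using the dominant background component's mean.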