Ahmad Waqas, Vagharshakyan Suren, Sjostrom Marten, Gotchev Atanas, Bregovic Robert, Olsson Roger
IEEE Trans Image Process. 2020 Jan 29. doi: 10.1109/TIP.2020.2969087.
Light field (LF) acquisition devices capture both spatial and angular information of a scene. In contrast with traditional cameras, the additional angular information enables novel post-processing applications, such as 3D scene reconstruction, refocusing at different depth planes, and synthetic aperture imaging. In this paper, we present a novel compression scheme for LF data captured using multiple traditional cameras. The input LF views were divided into two groups: key views and decimated views. The key views were compressed using the multi-view extension of High Efficiency Video Coding (MV-HEVC), and the decimated views were predicted using a shearlet-transform-based prediction (STBP) scheme. Additionally, the residual information of the predicted views was encoded and sent along with the coded stream of key views. The proposed scheme was evaluated on a benchmark multi-camera LF dataset, demonstrating that incorporating the residual information into the compression scheme increased the overall peak signal-to-noise ratio (PSNR) by 2 dB. The proposed scheme performed significantly better than the anchor schemes at low bit rates, whereas the anchor schemes achieved better compression efficiency at high bit rates. The sensitivity of the human visual system to compression artifacts, particularly at low bit rates, favors the proposed compression scheme over the anchor schemes.
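The pipeline described in the abstract (split views into key and decimated groups, predict the decimated views, add a coded residual, measure PSNR) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the decimation pattern, so keeping every second view as a key view is an assumption; neighbour averaging stands in for STBP, and uniform quantization stands in for residual encoding. The names `split_views`, `psnr`, and `demo` are hypothetical.

```python
import numpy as np


def split_views(num_views, step=2):
    """Split view indices into key views (every `step`-th) and decimated views."""
    key = list(range(0, num_views, step))
    decimated = [i for i in range(num_views) if i % step != 0]
    return key, decimated


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(test, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


def demo(num_views=5, quant_step=8.0, seed=0):
    """Return (PSNR of prediction only, PSNR of prediction plus coded residual)."""
    rng = np.random.default_rng(seed)
    base = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
    # Toy horizontal-parallax light field: view i is the base image shifted by i pixels.
    views = [np.roll(base, i, axis=1) for i in range(num_views)]
    key, decimated = split_views(num_views, step=2)

    target = decimated[0]
    # Stand-in predictor (NOT STBP): average the two neighbouring key views.
    predicted = 0.5 * (views[target - 1] + views[target + 1])
    # Stand-in residual coding (NOT MV-HEVC): uniform quantization of the residual.
    residual = views[target] - predicted
    coded_residual = np.round(residual / quant_step) * quant_step
    reconstructed = np.clip(predicted + coded_residual, 0, 255)

    return psnr(views[target], predicted), psnr(views[target], reconstructed)
```

Calling `demo()` returns the PSNR of the predicted view without and with the quantized residual added back. In this toy setting the second value is higher, which mirrors the qualitative effect the abstract reports for residual coding; the magnitudes here say nothing about the 2 dB gain measured in the paper.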