National Key Laboratory of Science and Technology on Multi-Spectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China.
Sensors (Basel). 2021 Mar 5;21(5):1815. doi: 10.3390/s21051815.
Rendering a shallow depth of field (DoF), which focuses on the region of interest by blurring out the rest of the image, is a challenging task in computer vision and computational photography. It can be achieved either by adjusting the parameters (e.g., aperture and focal length) of a single-lens reflex camera or by computational techniques. In this paper, we investigate the latter, i.e., we explore a computational method to render shallow DoF. Previous methods rely on either portrait segmentation or stereo sensing, so they either apply only to portrait photos or require stereo inputs. To address these issues, we study the problem of rendering shallow DoF from an arbitrary image. In particular, we propose a method that consists of a salient object detection (SOD) module, a monocular depth prediction (MDP) module, and a DoF rendering module. The SOD module determines the focal plane, while the MDP module controls the degree of blur. We introduce a label-guided ranking loss for both salient object detection and depth prediction. For salient object detection, the label-guided ranking loss comprises two terms: (i) a heterogeneous ranking loss that encourages the sampled salient pixels to differ from background pixels; and (ii) a homogeneous ranking loss that penalizes inconsistency among salient pixels or among background pixels. For depth prediction, the label-guided ranking loss mainly relies on multilevel structural information, ranging from low-level edge maps to high-level object instance masks. In addition, we introduce a SOD- and depth-aware blur rendering method to generate shallow DoF images. Comprehensive experiments demonstrate the effectiveness of the proposed method.
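The two terms of the label-guided ranking loss for salient object detection can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the logistic and squared-difference forms follow a pairwise ranking loss commonly used in relative depth estimation, and the names `pair_ranking_loss`, `label_guided_ranking_loss`, `num_pairs`, and `seed` are hypothetical. Pixel pairs are sampled at random, and the ground-truth saliency mask decides whether a pair contributes a heterogeneous or a homogeneous term.

```python
import math
import random


def pair_ranking_loss(p_a, p_b, label):
    """Loss for one sampled pixel pair (assumed form, not taken from the paper).

    label = +1: pixel a is salient and pixel b is background.
    label = -1: pixel a is background and pixel b is salient.
    label =  0: both pixels belong to the same class.
    """
    diff = p_a - p_b
    if label == 0:
        # Homogeneous term: penalize inconsistent predictions within one class.
        return diff * diff
    # Heterogeneous term: push the salient score above the background score.
    return math.log(1.0 + math.exp(-label * diff))


def label_guided_ranking_loss(pred, mask, num_pairs=1000, seed=0):
    """Average pair loss over randomly sampled pixel pairs.

    pred: flat list of predicted saliency scores.
    mask: flat list of ground-truth labels (1 = salient, 0 = background).
    """
    rng = random.Random(seed)
    n = len(pred)
    total = 0.0
    for _ in range(num_pairs):
        i, j = rng.randrange(n), rng.randrange(n)
        label = mask[i] - mask[j]  # +1, -1, or 0, read from the ground truth
        total += pair_ranking_loss(pred[i], pred[j], label)
    return total / num_pairs


mask = [1, 1, 0, 0]
good = [0.9, 0.8, 0.1, 0.2]  # salient pixels score higher than background
bad = [0.1, 0.2, 0.9, 0.8]   # ordering is inverted
loss_good = label_guided_ranking_loss(good, mask)
loss_bad = label_guided_ranking_loss(bad, mask)
```

With the same sampled pairs, the prediction that ranks salient pixels above background pixels receives the lower loss. A real training setup would compute this on tensors with automatic differentiation; plain Python lists are used here only to keep the sketch self-contained.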