Pan Liyuan, Hartley Richard, Liu Liu, Xu Zhiwei, Chowdhury Shah, Yang Yan, Zhang Hongguang, Li Hongdong, Liu Miaomiao
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11314-11330. doi: 10.1109/TPAMI.2024.3458974. Epub 2024 Nov 6.
Dual-pixel (DP) imaging sensors are increasingly adopted in modern cameras. A DP camera captures a pair of images in a single snapshot by splitting each pixel in half. Several previous studies show how to recover depth information by treating the DP pair as an approximate stereo pair. However, unlike classic stereo disparity, dual-pixel disparity occurs only in image regions with defocus blur. Heavy defocus blur in DP pairs degrades the performance of matching-based depth estimation approaches. Therefore, we treat blur removal and depth estimation as a joint problem. Rather than blindly removing the blur, we investigate the formation of the DP pair, which links the blur and depth information. We propose a mathematical DP model that uses the blur to improve depth estimation. This exploration motivated our previous work, DDDNet (DP-based Depth and Deblur Network), an end-to-end network that jointly estimates depth and restores the image in a supervised fashion. However, collecting ground-truth (GT) depth maps for DP pairs is challenging, which limits the depth estimation potential of the DP sensor. Therefore, we propose an extension of DDDNet, called WDDNet (Weakly-supervised Depth and Deblur Network), which includes an efficient reblur solver and does not require GT depth maps for training. To achieve this, WDDNet converts all-in-focus images into supervisory signals for unsupervised depth estimation. We jointly estimate an all-in-focus image and a disparity map, then use a Reblur and Fstack module to regularize the disparity estimation and image restoration. Extensive experiments on synthetic and real data demonstrate that our method performs competitively with state-of-the-art (SOTA) supervised approaches.
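The reblur idea in the abstract can be illustrated with a small NumPy sketch: given an all-in-focus image and a disparity map, synthesize a DP pair and compare it with the observed pair, so that no GT depth map is needed. This is only an illustration under stated assumptions, not the paper's Reblur module. The half-disk point spread functions, the one-to-one mapping from disparity magnitude to blur radius in pixels, and every function name (`half_disk_kernels`, `filter2d_same`, `reblur_dp`, `reblur_loss`) are hypothetical choices made for this example, and the hard per-pixel kernel selection used here is not differentiable as a trained network would require.

```python
import numpy as np


def half_disk_kernels(radius):
    """Return (left, right) half-disk kernels of integer radius, each summing to 1.

    Assumption: each DP view sees one half of a circular aperture.
    """
    if radius == 0:
        identity = np.ones((1, 1))
        return identity, identity
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    disk = (xs ** 2 + ys ** 2 <= radius ** 2).astype(float)
    left = disk * (xs <= 0)
    right = disk * (xs >= 0)
    return left / left.sum(), right / right.sum()


def filter2d_same(img, kernel):
    """Correlate a 2D image with a kernel using edge padding (same output size)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def reblur_dp(sharp, disparity, max_radius=4):
    """Synthesize a (left, right) DP pair from a sharp image and a disparity map.

    Assumption: blur radius in pixels equals the rounded |disparity|, and the
    sign of the disparity swaps which half-disk each view receives.
    """
    radius = np.clip(np.rint(np.abs(disparity)), 0, max_radius).astype(int)
    left = np.zeros(sharp.shape, dtype=float)
    right = np.zeros(sharp.shape, dtype=float)
    for r in range(max_radius + 1):
        k_left, k_right = half_disk_kernels(r)
        blur_l = filter2d_same(sharp, k_left)
        blur_r = filter2d_same(sharp, k_right)
        pos = (radius == r) & (disparity >= 0)
        neg = (radius == r) & (disparity < 0)
        left[pos], right[pos] = blur_l[pos], blur_r[pos]
        left[neg], right[neg] = blur_r[neg], blur_l[neg]
    return left, right


def reblur_loss(sharp, disparity, dp_left, dp_right, max_radius=4):
    """Mean absolute error between the synthesized and the observed DP pair."""
    syn_l, syn_r = reblur_dp(sharp, disparity, max_radius)
    return float(np.mean(np.abs(syn_l - dp_left)) + np.mean(np.abs(syn_r - dp_right)))


# Usage: the loss vanishes for the disparity that generated the observation.
rng = np.random.default_rng(0)
sharp = rng.random((16, 16))
true_disp = np.full((16, 16), 2.0)
obs_l, obs_r = reblur_dp(sharp, true_disp)
loss_true = reblur_loss(sharp, true_disp, obs_l, obs_r)
loss_wrong = reblur_loss(sharp, np.zeros((16, 16)), obs_l, obs_r)
```

In this toy setting the observed pair itself supervises the disparity: a disparity map that explains the blur in both views yields a lower reblur loss than one that does not, which is the sense in which a reblur term can stand in for GT depth.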