Jin Jing, Guo Mantang, Hou Junhui, Liu Hui, Xiong Hongkai
IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12050-12067. doi: 10.1109/TPAMI.2023.3287603. Epub 2023 Sep 5.
This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, i.e., a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a novel end-to-end learning-based approach, which comprehensively utilizes the specific characteristics of the input from two complementary and parallel perspectives. Specifically, one module regresses a spatially consistent intermediate estimation by learning a deep multidimensional and cross-domain feature representation, while the other module warps another intermediate estimation, which preserves high-frequency textures, by propagating the information of the high-resolution view. We finally leverage the advantages of the two intermediate estimations adaptively via learned confidence maps, leading to a final high-resolution LF image with satisfactory results on both plain textured areas and depth discontinuous boundaries. In addition, to promote the effectiveness of our method, which is trained on simulated hybrid data, when applied to real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy. Extensive experiments on both real and simulated hybrid data demonstrate the significant superiority of our approach over state-of-the-art methods. To the best of our knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input. We believe our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission. The code will be publicly available at https://github.com/jingjin25/LFhybridSR-Fusion.
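The abstract describes fusing two intermediate estimations (a regression-based, spatially consistent one and a warping-based, texture-preserving one) through learned confidence maps. The following is a minimal sketch of that fusion idea, not the authors' released code: the module name, layer sizes, and tensor shapes are illustrative assumptions, and only the per-pixel confidence-weighted blending reflects the mechanism stated in the abstract.

```python
# Minimal sketch of confidence-map-based fusion of two intermediate estimates.
# Not the authors' implementation; shapes and layer widths are assumptions.
import torch
import torch.nn as nn

class ConfidenceFusion(nn.Module):
    def __init__(self, channels: int = 1):
        super().__init__()
        # Small conv head that predicts one confidence map per intermediate estimate.
        self.conf_net = nn.Sequential(
            nn.Conv2d(2 * channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, kernel_size=3, padding=1),
        )

    def forward(self, est_regress: torch.Tensor, est_warp: torch.Tensor) -> torch.Tensor:
        # est_regress: spatially consistent estimate, shape (B, C, H, W) per view
        # est_warp:    warping-based estimate preserving high-frequency textures
        conf = torch.softmax(
            self.conf_net(torch.cat([est_regress, est_warp], dim=1)), dim=1
        )
        # Per-pixel adaptive blend of the two estimates via the confidence maps.
        return conf[:, 0:1] * est_regress + conf[:, 1:2] * est_warp

# Usage on a batch of single-channel sub-aperture views (hypothetical sizes):
fusion = ConfidenceFusion(channels=1)
fused = fusion(torch.rand(4, 1, 256, 256), torch.rand(4, 1, 256, 256))
print(fused.shape)  # torch.Size([4, 1, 256, 256])
```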