Fan Jinlong, Zhang Jing, Tao Dacheng
IEEE Trans Image Process. 2023;32:865-877. doi: 10.1109/TIP.2022.3231087. Epub 2023 Jan 23.
Deep learning has proven effective for image rectification, leveraging the representation capacity of deep neural networks through supervised training on large-scale synthetic datasets. However, such a model may overfit the synthetic images and generalize poorly to real-world fisheye images, because any specific distortion model has limited universality and the distortion and rectification process is not modeled explicitly. In this paper, we propose a novel self-supervised image rectification (SIR) method based on an important insight: the rectified results of distorted images of the same scene taken through different lenses should be the same. Specifically, we devise a new network architecture with a shared encoder and several prediction heads, each of which predicts the distortion parameter of a specific distortion model. We further use a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters, and we exploit the intra- and inter-model consistency between them during training. This yields a self-supervised learning scheme that needs neither ground-truth distortion parameters nor normal images. Experiments on a synthetic dataset and on real-world fisheye images demonstrate that our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art (SOTA) methods. The proposed self-supervised method also provides a possible way to improve the universality of distortion models while keeping their self-consistency. Code and datasets will be available at https://github.com/loong8888/SIR.
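The consistency insight can be illustrated with a toy sketch on point coordinates rather than images. This is not the authors' implementation: there is no encoder, no prediction heads, and no image warping. The two radial models (a polynomial model and a one-parameter division model), the parameter values, and every function name below are assumptions chosen for illustration, since the abstract does not name the distortion models it uses. The sketch distorts the same scene with two hypothetical lenses, rectifies each view with a guessed parameter, and measures how far the two rectified results disagree. The undistorted scene is never used in the loss.

```python
import math

def rectify_poly(r_d, k):
    # Polynomial model (assumed): undistorted radius from distorted radius.
    return r_d * (1.0 + k * r_d ** 2)

def rectify_div(r_d, k):
    # One-parameter division model (assumed).
    return r_d / (1.0 + k * r_d ** 2)

def distort(r_u, k, rectify, iters=60):
    # Invert `rectify` by bisection. Assumes barrel distortion, so the
    # distorted radius lies in [0, r_u] and `rectify` is increasing there.
    lo, hi = 0.0, r_u
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rectify(mid, k) < r_u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def warp(points, radial_map):
    # Move each point along its ray from the image center.
    out = []
    for x, y in points:
        r = math.hypot(x, y)
        s = radial_map(r) / r if r > 0 else 1.0
        out.append((x * s, y * s))
    return out

def mse(a, b):
    # Mean squared distance between corresponding points.
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b)) / len(a)

# Hypothetical lens parameters and a small grid of scene points.
K_POLY, K_DIV = 0.4, -0.3
scene = [(0.1 * i, 0.1 * j) for i in range(-6, 7, 3) for j in range(-6, 7, 3)]

# Two distorted views of the same scene, one per lens model.
view_poly = warp(scene, lambda r: distort(r, K_POLY, rectify_poly))
view_div = warp(scene, lambda r: distort(r, K_DIV, rectify_div))

def consistency_loss(k_poly, k_div):
    # Rectify each view with its guessed parameter and compare the results.
    rect_poly = warp(view_poly, lambda r: rectify_poly(r, k_poly))
    rect_div = warp(view_div, lambda r: rectify_div(r, k_div))
    return mse(rect_poly, rect_div)

print(consistency_loss(K_POLY, K_DIV))  # guesses equal to the true parameters
print(consistency_loss(0.0, K_DIV))     # wrong guess for the polynomial lens
```

In this toy setting the disagreement is close to zero when both guesses match the true parameters and grows when one guess is wrong, which is the kind of signal a self-supervised scheme can minimize in place of a ground-truth label.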