Sony Research and Development Center Beijing Lab, Chao-Yang District, Beijing, 100027, China.
Neural Netw. 2024 Jul;175:106281. doi: 10.1016/j.neunet.2024.106281. Epub 2024 Mar 28.
Due to distribution shift, deep learning based methods for image dehazing suffer from performance degradation when applied to real-world hazy images. In this paper, we consider a dehazing framework based on conditional diffusion models for improved generalization to real haze. First, we find that optimizing the training objective of diffusion models, i.e., Gaussian noise vectors, is non-trivial. The spectral bias of deep networks hinders the higher-frequency modes in Gaussian vectors from being learned and hence impairs the reconstruction of image details. To tackle this issue, we design a network unit, named Frequency Compensation block (FCB), with a bank of filters that jointly emphasize the mid-to-high frequencies of an input signal. We demonstrate that diffusion models with FCB achieve significant gains in both perceptual and distortion metrics. Second, to further boost the generalization performance, we propose a novel data synthesis pipeline, HazeAug, to augment haze in terms of degree and diversity. Within this framework, we set up a solid baseline for blind dehazing, in which models are trained on synthetic hazy-clean pairs and generalize directly to real data. Extensive evaluations on real dehazing datasets demonstrate the superior performance of the proposed dehazing diffusion model in distortion metrics. Compared to recent methods pre-trained on large-scale, high-quality image datasets, our model achieves a significant PSNR improvement of over 1 dB on challenging databases such as Dense-Haze and Nh-Haze.
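The idea of a filter bank that emphasizes mid-to-high frequencies can be illustrated with a small NumPy sketch. This is not the FCB itself: the abstract does not specify the block's filters or how they are learned, so the sketch assumes a fixed bank of Gaussian high-pass bands (the input minus a Gaussian blur at several scales) applied in the Fourier domain. The names `gaussian_lowpass_gain` and `frequency_emphasis`, and the choice of `sigmas`, are hypothetical.

```python
import numpy as np


def gaussian_lowpass_gain(shape, sigma):
    """Frequency response of a Gaussian blur with spatial std `sigma` (pixels)."""
    fy = np.fft.fftfreq(shape[0])[:, None]  # cycles per pixel, vertical
    fx = np.fft.fftfreq(shape[1])[None, :]  # cycles per pixel, horizontal
    # The Fourier transform of a Gaussian is a Gaussian; the gain is 1 at DC.
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))


def frequency_emphasis(x, sigmas=(1.0, 2.0, 4.0)):
    """Add the average of several high-pass bands (x - blur(x)) back to x.

    Assumption: a fixed, non-learned filter bank used only for illustration.
    """
    spectrum = np.fft.fft2(x)
    # Each band has gain (1 - lowpass): 0 at DC, approaching 1 at high frequencies.
    band_gain = np.mean(
        [1.0 - gaussian_lowpass_gain(x.shape, s) for s in sigmas], axis=0
    )
    # Total gain is 1 at DC and between 1 and 2 elsewhere.
    return np.real(np.fft.ifft2(spectrum * (1.0 + band_gain)))


rng = np.random.default_rng(0)
eps = rng.standard_normal((64, 64))  # stand-in for a Gaussian noise target
out = frequency_emphasis(eps)
```

Because the combined gain equals 1 at zero frequency and exceeds 1 everywhere else, the mean of the signal is preserved while its non-constant components are amplified, more strongly toward higher frequencies. In a network, a unit of this kind would presumably use learnable filters rather than fixed Gaussians.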