Zhang Ya-Nan, Li Qiufu, Wu Xu, Mu Nan, Li Xiaoning, Shen Linlin
IEEE Trans Image Process. 2025;34:4040-4051. doi: 10.1109/TIP.2025.3581418.
Removing unwanted reflections from images is a fundamental yet challenging problem in low-level computer vision. Recent deep learning-based Single Image Reflection Removal (SIRR) methods have made significant progress. However, separating reflections from transmission content remains difficult, particularly in complex scenes where the two are visually similar. Upon careful analysis, we find that reflections reside predominantly in the high-frequency components of an image: they tend to distort fine details in the high-frequency range, while the low-frequency information is comparatively less affected. This observation motivates a frequency-aware approach to SIRR based on the Discrete Wavelet Transform (DWT). Wavelet decomposition allows reflective artifacts to be distinguished and isolated in the frequency domain while the transmission information is preserved. Building on this insight, we propose a novel Wavelet-guided Deep Unfolding Network (WDUNet) that combines wavelet decomposition with deep unfolding to improve the interpretability and generalization of SIRR methods. Specifically, we formulate an optimization-based reflection removal model using DWT and convolutional dictionaries. The model is solved with a proximal gradient algorithm, and the resulting iterations are unfolded into a neural network architecture whose parameters are all learned end-to-end during training. Additionally, we design a Low-frequency Parameter Estimation Module (LPEM) and a High-frequency Parameter Estimation Module (HPEM) and integrate them into WDUNet, allowing the network to automatically learn and optimize the model's hyperparameters.
Extensive experiments conducted on four benchmark datasets demonstrate that WDUNet consistently outperforms existing state-of-the-art methods in both objective evaluation metrics and subjective visual quality.
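The two ingredients named in the abstract, wavelet decomposition and proximal gradient iterations, can be illustrated with a small sketch. This is not the WDUNet implementation: the abstract does not state the wavelet family, the objective, or the network layers, so the Haar wavelet, the toy objective 0.5*||x - y||^2 + lam*||x||_1, and every function name below are assumptions chosen for illustration only.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2-D orthonormal Haar DWT (assumed wavelet, for illustration).

    Expects a 2-D array with even height and width. Returns the low-frequency
    band and a tuple of three high-frequency (detail) bands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0
    detail_1 = (a - b + c - d) / 2.0  # difference across columns
    detail_2 = (a + b - c - d) / 2.0  # difference across rows
    detail_3 = (a - b - c + d) / 2.0  # diagonal difference
    return low, (detail_1, detail_2, detail_3)


def haar_idwt2(low, details):
    """Inverse of haar_dwt2: rebuilds the image from its four subbands."""
    d1, d2, d3 = details
    out = np.empty((2 * low.shape[0], 2 * low.shape[1]), dtype=float)
    out[0::2, 0::2] = (low + d1 + d2 + d3) / 2.0
    out[0::2, 1::2] = (low - d1 + d2 - d3) / 2.0
    out[1::2, 0::2] = (low + d1 - d2 - d3) / 2.0
    out[1::2, 1::2] = (low - d1 - d2 + d3) / 2.0
    return out


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def proximal_gradient_step(x, y, step, lam):
    """One proximal gradient step on the toy objective
    0.5 * ||x - y||^2 + lam * ||x||_1 (an assumed stand-in objective).

    In a deep unfolding network, each such step becomes one stage, and
    quantities like `step` and `lam` are learned rather than hand-set.
    """
    grad = x - y  # gradient of the smooth data-fidelity term
    return soft_threshold(x - step * grad, step * lam)
```

In this sketch, a reflection-contaminated image would be split with `haar_dwt2`, the detail bands would be processed (here, by repeated calls to `proximal_gradient_step`), and the result would be recombined with `haar_idwt2`. The abstract describes LPEM and HPEM as modules that estimate the model's hyperparameters; mapping them onto `step` and `lam` here is only an analogy.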