IEEE Trans Image Process. 2022;31:3440-3448. doi: 10.1109/TIP.2022.3170726. Epub 2022 May 11.
Existing adherent raindrop removal methods focus on detecting raindrop locations and then use inpainting techniques or generative networks to recover the background behind the raindrops. Yet, because adherent raindrops vary widely in size and appearance, detection is challenging for both single images and video. Moreover, unlike rain streaks, adherent raindrops tend to cover the same area across several frames. Addressing these problems, we propose a two-stage video-based raindrop removal method. The first stage is a single-image module, which generates initial clean results. The second stage is a multi-frame module, which further refines the initial results using temporal constraints, namely by utilizing multiple input frames and enforcing temporal consistency between adjacent output frames. Our single-image module employs a raindrop removal network to generate initial raindrop-free results and creates a mask representing the differences between the input and the initial output. Once the masks and initial results for consecutive frames are obtained, our multi-frame module aligns the frames at both the image and feature levels and then recovers the clean background. Our method first employs optical flow to align the frames, and then applies deformable convolution layers to achieve feature-level frame alignment. To remove small raindrops and recover the correct background, a target frame is predicted from its adjacent frames. A series of unsupervised losses is proposed so that our second stage, the video raindrop removal module, can learn from video data without ground truths. Experimental results on real videos demonstrate the state-of-the-art performance of our method, both quantitatively and qualitatively.
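The mask described above (the difference between the input frame and the single-image module's initial clean output) can be illustrated with a minimal sketch. This is not the paper's implementation: the threshold value and the plain-list pixel representation are hypothetical, chosen only to show the idea that pixels changed by the first stage are flagged as likely raindrop regions.

```python
def raindrop_mask(frame, initial_clean, thresh=0.1):
    """Binary mask marking pixels where the single-image module changed
    the input, i.e. likely raindrop regions.

    `frame` and `initial_clean` are 2D lists of intensities in [0, 1];
    `thresh` is a hypothetical tolerance, not taken from the paper.
    """
    return [
        [1 if abs(p - q) > thresh else 0
         for p, q in zip(row_in, row_out)]
        for row_in, row_out in zip(frame, initial_clean)
    ]

# Toy 2x2 example: two pixels were altered by the removal network.
frame = [[0.5, 0.9], [0.2, 0.2]]
clean = [[0.5, 0.4], [0.2, 0.5]]
mask = raindrop_mask(frame, clean)  # [[0, 1], [0, 1]]
```

In the full pipeline, these per-frame masks would then guide the multi-frame module, which aligns neighboring frames (via optical flow and deformable convolutions) to fill in the masked regions from frames where the background is visible.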