Lucas Alice, Lopez-Tapia Santiago, Molina Rafael, Katsaggelos Aggelos K
IEEE Trans Image Process. 2019 Jul;28(7):3312-3327. doi: 10.1109/TIP.2019.2895768. Epub 2019 Jan 29.
Video super-resolution (VSR) has become one of the most critical problems in video processing. In the deep learning literature, recent works have shown the benefits of using adversarial-based and perceptual losses to improve the performance on various image restoration tasks; however, these have yet to be applied to video super-resolution. In this paper, we propose a generative adversarial network (GAN)-based formulation for VSR. We introduce a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during the GAN training. We further enhance our VSR GAN formulation with two regularizers, a distance loss in feature-space and pixel-space, to obtain our final VSRResFeatGAN model. We show that pre-training our generator with the mean-squared-error loss alone already quantitatively surpasses the current state-of-the-art VSR models. We then employ the PercepDist metric to compare the state-of-the-art VSR models, and show that this metric more accurately evaluates the perceptual quality of SR solutions obtained from neural networks than the commonly used PSNR/SSIM metrics. Finally, we show that our proposed VSRResFeatGAN model outperforms the current state-of-the-art SR models, both quantitatively and qualitatively.
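The abstract describes a generator objective that combines an adversarial term with two distance regularizers, one in pixel-space and one in feature-space. The following is a minimal PyTorch-style sketch of such a composite generator loss, included only to illustrate the structure of the objective; the class name GeneratorLoss, the loss weights, and the choice of feature extractor are illustrative assumptions, not the paper's exact formulation.

    import torch
    import torch.nn as nn

    class GeneratorLoss(nn.Module):
        """Sketch of an adversarial loss plus pixel- and feature-space
        distance regularizers. Weights w_adv, w_pix, w_feat and the
        feature_extractor (e.g., a fixed pretrained CNN slice) are
        assumptions for illustration."""

        def __init__(self, feature_extractor, w_adv=1e-3, w_pix=1.0, w_feat=1.0):
            super().__init__()
            self.feature_extractor = feature_extractor
            self.w_adv, self.w_pix, self.w_feat = w_adv, w_pix, w_feat
            self.bce = nn.BCEWithLogitsLoss()
            self.mse = nn.MSELoss()

        def forward(self, sr, hr, disc_logits_on_sr):
            # Adversarial term: reward SR frames that the discriminator labels real.
            adv = self.bce(disc_logits_on_sr, torch.ones_like(disc_logits_on_sr))
            # Pixel-space distance between the SR estimate and the ground truth.
            pix = self.mse(sr, hr)
            # Feature-space distance computed on a fixed pretrained network.
            feat = self.mse(self.feature_extractor(sr), self.feature_extractor(hr))
            return self.w_adv * adv + self.w_pix * pix + self.w_feat * feat

Dropping the adversarial and feature terms (w_adv = w_feat = 0) recovers a plain mean-squared-error objective, corresponding to the MSE-only pre-training stage mentioned in the abstract.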