Liang Jingyun, Cao Jiezhang, Fan Yuchen, Zhang Kai, Ranjan Rakesh, Li Yawei, Timofte Radu, Van Gool Luc
IEEE Trans Image Process. 2024;33:2171-2182. doi: 10.1109/TIP.2024.3372454. Epub 2024 Mar 22.
Video restoration aims to restore high-quality frames from low-quality frames. Unlike single-image restoration, video restoration generally requires utilizing temporal information from multiple adjacent but usually misaligned video frames. Existing deep methods generally tackle this with a sliding-window strategy or a recurrent architecture, both of which are restricted to frame-by-frame restoration. In this paper, we propose a Video Restoration Transformer (VRT) with parallel frame prediction ability. More specifically, VRT is composed of multiple scales, each of which consists of two kinds of modules: temporal reciprocal self-attention (TRSA) and parallel warping. TRSA divides the video into small clips, on which reciprocal attention is applied for joint motion estimation, feature alignment, and feature fusion, while self-attention is used for feature extraction. To enable cross-clip interactions, the video sequence is shifted for every other layer. In addition, parallel warping is used to further fuse information from neighboring frames by parallel feature warping. Experimental results on five tasks, including video super-resolution, video deblurring, video denoising, video frame interpolation, and space-time video super-resolution, demonstrate that VRT outperforms state-of-the-art methods by large margins (up to 2.16 dB) on fourteen benchmark datasets. The code is available at https://github.com/JingyunLiang/VRT.
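The core mechanism described in the abstract (attention within small temporal clips, with the sequence shifted on alternating layers so information can flow across clip boundaries) can be sketched as a toy numpy illustration. This is not the actual VRT code: `clip_attention`, `trsa_stack`, and the single-head dot-product attention over per-frame feature vectors are simplifying assumptions made here purely to show the clip-and-shift pattern.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def clip_attention(feats, clip_size=2, shift=False):
    """Apply joint attention within non-overlapping temporal clips.

    feats: (T, C) array, one feature vector per frame (T divisible by clip_size).
    If shift=True, the sequence is cyclically shifted by clip_size // 2 before
    clipping (and shifted back afterwards), so alternating layers let
    information cross clip boundaries, as the abstract describes.
    """
    T, C = feats.shape
    s = clip_size // 2 if shift else 0
    x = np.roll(feats, -s, axis=0)
    out = np.empty_like(x)
    for start in range(0, T, clip_size):
        clip = x[start:start + clip_size]           # (clip_size, C)
        scores = clip @ clip.T / np.sqrt(C)         # frame-to-frame affinities
        out[start:start + clip_size] = softmax(scores) @ clip
    return np.roll(out, s, axis=0)

def trsa_stack(feats, num_layers=4, clip_size=2):
    """Stack layers, alternating unshifted / shifted clipping."""
    for i in range(num_layers):
        feats = clip_attention(feats, clip_size, shift=(i % 2 == 1))
    return feats
```

With `clip_size=2` each attention step mixes exactly two frames, and the alternating shift is what allows information to propagate across the whole sequence after enough layers; the real model additionally uses the reciprocal attention for alignment and parallel warping for fusion, which this sketch omits.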