Zhao Bin, Li Xuelong
IEEE Trans Neural Netw Learn Syst. 2022 Jun 8;PP. doi: 10.1109/TNNLS.2022.3178281.
Video frame interpolation can up-convert the frame rate and enhance video quality. Although interpolation performance has improved greatly in recent years, image blur still occurs at object boundaries owing to large motion. This long-standing problem has not yet been adequately addressed. In this brief, we propose to reduce image blur and recover clear object shapes by preserving edges in the interpolated frames. To this end, the proposed edge-aware network (EA-Net) integrates edge information into the frame interpolation task. It is an end-to-end architecture comprising two stages: edge-guided flow estimation and edge-protected frame synthesis. Specifically, in the flow estimation stage, three edge-aware mechanisms are developed to emphasize frame edges when estimating flow maps, with the edge maps serving as auxiliary information that guides the network toward more accurate flow. In the frame synthesis stage, a flow refinement module refines the estimated flow maps, and an attention module adaptively weights the bidirectional flow maps when synthesizing the intermediate frames. Furthermore, frame and edge discriminators are adopted for adversarial training, enhancing the realism and sharpness of the synthesized frames. Experiments on three benchmarks, Vimeo90k and UCF101 for single-frame interpolation and Adobe240-fps for multiframe interpolation, demonstrate the superiority of the proposed EA-Net for the video frame interpolation task.
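To make the two-stage idea concrete, the sketch below illustrates one plausible reading of the pipeline: edge maps concatenated with the input frames to guide bidirectional flow estimation, followed by attention-weighted fusion of the two warped frames. It is a minimal PyTorch illustration only; the Sobel edge extractor, layer sizes, module names, and the linear-motion scaling of flows to the midpoint are all assumptions for exposition, not the authors' EA-Net implementation.

```python
# Illustrative sketch of an edge-guided, two-stage frame interpolation pipeline.
# All architectural details here are assumptions, NOT the published EA-Net.
import torch
import torch.nn as nn
import torch.nn.functional as F


def sobel_edges(frame):
    """Simple Sobel edge map (stand-in for the edge extraction assumed here)."""
    gray = frame.mean(dim=1, keepdim=True)
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=frame.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)


def backward_warp(frame, flow):
    """Warp `frame` with a dense flow field via bilinear sampling."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=frame.device),
                            torch.arange(w, device=frame.device),
                            indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow
    # Normalize sampling coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(frame, torch.stack((gx, gy), dim=-1), align_corners=True)


class EdgeGuidedFlow(nn.Module):
    """Stage 1: estimate bidirectional flows from two frames, with their
    edge maps concatenated as auxiliary guidance (hypothetical layout)."""
    def __init__(self, ch=32):
        super().__init__()
        # 2 frames (3 ch each) + 2 edge maps (1 ch each) = 8 input channels.
        self.net = nn.Sequential(
            nn.Conv2d(8, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 4, 3, padding=1),  # flow 0->1 and flow 1->0
        )

    def forward(self, f0, f1, e0, e1):
        flows = self.net(torch.cat([f0, f1, e0, e1], dim=1))
        return flows[:, :2], flows[:, 2:]


class AttentiveSynthesis(nn.Module):
    """Stage 2: fuse the two warped frames with a per-pixel attention map
    (flow refinement omitted for brevity)."""
    def __init__(self, ch=32):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(6, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, f0, f1, flow_t0, flow_t1):
        w0 = backward_warp(f0, flow_t0)
        w1 = backward_warp(f1, flow_t1)
        a = self.attn(torch.cat([w0, w1], dim=1))
        return a * w0 + (1 - a) * w1


if __name__ == "__main__":
    f0, f1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    flow_net, synth_net = EdgeGuidedFlow(), AttentiveSynthesis()
    flow01, flow10 = flow_net(f0, f1, sobel_edges(f0), sobel_edges(f1))
    # Crude linear-motion approximation of the flows to the midpoint t = 0.5.
    frame_t = synth_net(f0, f1, 0.5 * flow10, 0.5 * flow01)
    print(frame_t.shape)  # torch.Size([1, 3, 64, 64])
```

In this reading, the edge maps act purely as extra input channels to the flow estimator and the attention map decides, per pixel, how much to trust each warped frame; the paper's three edge-aware mechanisms, flow refinement module, and frame/edge discriminators are not reproduced here.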