
TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation

Authors

Liu Chengxu, Yang Huan, Fu Jianlong, Qian Xueming

Publication

IEEE Trans Image Process. 2023;32:4728-4741. doi: 10.1109/TIP.2023.3302990. Epub 2023 Aug 22.

DOI: 10.1109/TIP.2023.3302990
PMID: 37566503
Abstract

Video frame interpolation (VFI) aims to synthesize an intermediate frame between two consecutive frames. State-of-the-art approaches usually adopt a two-step solution, which includes 1) generating locally-warped pixels by calculating the optical flow based on pre-defined motion patterns (e.g., uniform motion, symmetric motion), and 2) blending the warped pixels to form a full frame through deep neural synthesis networks. However, for various complicated motions (e.g., non-uniform motion, turning around), such improper assumptions about pre-defined motion patterns introduce inconsistent warping from the two consecutive frames. As a result, the warped features for the new frame are usually misaligned, yielding distortion and blur, especially when large and complex motions occur. To solve this issue, in this paper we propose a novel Trajectory-aware Transformer for Video Frame Interpolation (TTVFI). In particular, we formulate the warped features with inconsistent motions as query tokens, and formulate relevant regions along a motion trajectory from the two original consecutive frames as keys and values. Self-attention is learned on relevant tokens along the trajectory to blend the pristine features into intermediate frames through end-to-end training. Experimental results demonstrate that our method outperforms other state-of-the-art methods on four widely-used VFI benchmarks. Both code and pre-trained models will be released at https://github.com/ChengxuLiu/TTVFI.
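The query/key/value formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it only shows the core idea under simplifying assumptions: each warped-feature query token attends to a small set of tokens sampled along its motion trajectory in the two original frames, and the attention weights blend those trajectory values into the intermediate-frame feature. The function name and tensor shapes are illustrative choices, not part of the paper.

```python
import torch
import torch.nn.functional as F

def trajectory_attention(q, k0, v0, k1, v1):
    """Blend features along a motion trajectory via attention.

    q:            (B, N, C) query tokens from the warped intermediate frame
    k0, v0:       (B, N, C) key/value tokens sampled on the trajectory in frame 0
    k1, v1:       (B, N, C) key/value tokens sampled on the trajectory in frame 1
    returns:      (B, N, C) blended features for the intermediate frame
    """
    # Stack the trajectory tokens from both original frames: (B, N, 2, C).
    k = torch.stack([k0, k1], dim=2)
    v = torch.stack([v0, v1], dim=2)

    # Scaled dot-product attention of each query over its trajectory tokens.
    scale = q.shape[-1] ** -0.5
    attn = torch.einsum('bnc,bntc->bnt', q, k) * scale   # (B, N, 2)
    attn = F.softmax(attn, dim=-1)

    # Weighted blend of the trajectory values: (B, N, C).
    return torch.einsum('bnt,bntc->bnc', attn, v)
```

In the full model the trajectory tokens would be gathered with flow-guided sampling and the attention would run inside a Transformer block with projections and residuals; this sketch isolates only the trajectory-wise blending step.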


Similar Articles

1. TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation.
IEEE Trans Image Process. 2023;32:4728-4741. doi: 10.1109/TIP.2023.3302990. Epub 2023 Aug 22.
2. JNMR: Joint Non-Linear Motion Regression for Video Frame Interpolation.
IEEE Trans Image Process. 2023;32:5283-5295. doi: 10.1109/TIP.2023.3315122. Epub 2023 Sep 22.
3. Motion-Aware Video Frame Interpolation.
Neural Netw. 2024 Oct;178:106433. doi: 10.1016/j.neunet.2024.106433. Epub 2024 Jun 14.
4. VRT: A Video Restoration Transformer.
IEEE Trans Image Process. 2024;33:2171-2182. doi: 10.1109/TIP.2024.3372454. Epub 2024 Mar 22.
5. Multiple Video Frame Interpolation via Enhanced Deformable Separable Convolution.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7029-7045. doi: 10.1109/TPAMI.2021.3100714. Epub 2022 Sep 14.
6. SuperFast: 200× Video Frame Interpolation via Event Camera.
IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7764-7780. doi: 10.1109/TPAMI.2022.3224051. Epub 2023 May 5.
7. Edge-Aware Network for Flow-Based Video Frame Interpolation.
IEEE Trans Neural Netw Learn Syst. 2022 Jun 8;PP. doi: 10.1109/TNNLS.2022.3178281.
8. MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement.
IEEE Trans Pattern Anal Mach Intell. 2021 Mar;43(3):933-948. doi: 10.1109/TPAMI.2019.2941941. Epub 2021 Feb 4.
9. Content-Aware Warping for View Synthesis.
IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):9486-9503. doi: 10.1109/TPAMI.2023.3242709. Epub 2023 Jun 30.
10. Video Summarization With Spatiotemporal Vision Transformer.
IEEE Trans Image Process. 2023;32:3013-3026. doi: 10.1109/TIP.2023.3275069. Epub 2023 May 26.