
TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation

Authors

Liu Chengxu, Yang Huan, Fu Jianlong, Qian Xueming

Publication

IEEE Trans Image Process. 2023;32:4728-4741. doi: 10.1109/TIP.2023.3302990. Epub 2023 Aug 22.

DOI: 10.1109/TIP.2023.3302990
PMID: 37566503
Abstract

Video frame interpolation (VFI) aims to synthesize an intermediate frame between two consecutive frames. State-of-the-art approaches usually adopt a two-step solution, which includes 1) generating locally-warped pixels by calculating the optical flow based on pre-defined motion patterns (e.g., uniform motion, symmetric motion), and 2) blending the warped pixels to form a full frame through deep neural synthesis networks. However, for various complicated motions (e.g., non-uniform motion, turning around), such improper assumptions about pre-defined motion patterns introduce inconsistent warping from the two consecutive frames. As a result, the warped features for the new frame are usually misaligned, yielding distortion and blur, especially when large and complex motions occur. To solve this issue, in this paper we propose a novel Trajectory-aware Transformer for Video Frame Interpolation (TTVFI). In particular, we formulate the warped features with inconsistent motions as query tokens, and formulate relevant regions along a motion trajectory from the two original consecutive frames as keys and values. Self-attention is learned on relevant tokens along the trajectory to blend the pristine features into intermediate frames through end-to-end training. Experimental results demonstrate that our method outperforms other state-of-the-art methods on four widely-used VFI benchmarks. Both code and pre-trained models will be released at https://github.com/ChengxuLiu/TTVFI.
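The query/key/value formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it only shows the core idea under simplifying assumptions: each warped-feature query token attends to a small set of tokens sampled along its motion trajectory in the two original frames, and the attention weights blend those trajectory values into the intermediate-frame feature. The function name and tensor shapes are illustrative choices, not part of the paper.

```python
import torch
import torch.nn.functional as F

def trajectory_attention(q, k0, v0, k1, v1):
    """Blend features along a motion trajectory via attention.

    q:            (B, N, C) query tokens from the warped intermediate frame
    k0, v0:       (B, N, C) key/value tokens sampled on the trajectory in frame 0
    k1, v1:       (B, N, C) key/value tokens sampled on the trajectory in frame 1
    returns:      (B, N, C) blended features for the intermediate frame
    """
    # Stack the trajectory tokens from both original frames: (B, N, 2, C).
    k = torch.stack([k0, k1], dim=2)
    v = torch.stack([v0, v1], dim=2)

    # Scaled dot-product attention of each query over its trajectory tokens.
    scale = q.shape[-1] ** -0.5
    attn = torch.einsum('bnc,bntc->bnt', q, k) * scale   # (B, N, 2)
    attn = F.softmax(attn, dim=-1)

    # Weighted blend of the trajectory values: (B, N, C).
    return torch.einsum('bnt,bntc->bnc', attn, v)
```

In the full model the trajectory tokens would be gathered with flow-guided sampling and the attention would run inside a Transformer block with projections and residuals; this sketch isolates only the trajectory-wise blending step.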


Similar Articles

1. TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation.
IEEE Trans Image Process. 2023;32:4728-4741. doi: 10.1109/TIP.2023.3302990. Epub 2023 Aug 22.
2. JNMR: Joint Non-Linear Motion Regression for Video Frame Interpolation.
IEEE Trans Image Process. 2023;32:5283-5295. doi: 10.1109/TIP.2023.3315122. Epub 2023 Sep 22.
3. Motion-Aware Video Frame Interpolation.
Neural Netw. 2024 Oct;178:106433. doi: 10.1016/j.neunet.2024.106433. Epub 2024 Jun 14.
4. VRT: A Video Restoration Transformer.
IEEE Trans Image Process. 2024;33:2171-2182. doi: 10.1109/TIP.2024.3372454. Epub 2024 Mar 22.
5. Multiple Video Frame Interpolation via Enhanced Deformable Separable Convolution.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7029-7045. doi: 10.1109/TPAMI.2021.3100714. Epub 2022 Sep 14.
6. SuperFast: 200× Video Frame Interpolation via Event Camera.
IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7764-7780. doi: 10.1109/TPAMI.2022.3224051. Epub 2023 May 5.
7. Edge-Aware Network for Flow-Based Video Frame Interpolation.
IEEE Trans Neural Netw Learn Syst. 2022 Jun 8;PP. doi: 10.1109/TNNLS.2022.3178281.
8. MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement.
IEEE Trans Pattern Anal Mach Intell. 2021 Mar;43(3):933-948. doi: 10.1109/TPAMI.2019.2941941. Epub 2021 Feb 4.
9. Content-Aware Warping for View Synthesis.
IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):9486-9503. doi: 10.1109/TPAMI.2023.3242709. Epub 2023 Jun 30.
10. Video Summarization With Spatiotemporal Vision Transformer.
IEEE Trans Image Process. 2023;32:3013-3026. doi: 10.1109/TIP.2023.3275069. Epub 2023 May 26.