Optical Flow as Spatial-Temporal Attention Learners.

Authors

Lu Yawen, Han Cheng, Wang Qifan, Fan Heng, Kong Zhaodan, Liu Dongfang, Chen Yingjie

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11491-11506. doi: 10.1109/TPAMI.2024.3463648. Epub 2024 Nov 6.

DOI: 10.1109/TPAMI.2024.3463648

Abstract

Optical flow is an indispensable building block for various important computer vision tasks, including motion estimation, object tracking, and disparity measurement. To date, the dominant methods are CNN-based, leaving plenty of room for improvement. In this work, we propose TransFlow, a transformer architecture for optical flow estimation. Compared to dominant CNN-based methods, TransFlow demonstrates three advantages. First, it provides more accurate correlation and trustworthy matching in flow estimation by utilizing spatial self-attention and cross-attention mechanisms between adjacent frames to effectively capture global dependencies. Second, it recovers more of the information compromised in flow estimation (e.g., by occlusion and motion blur) through long-range temporal association in dynamic scenes. Third, it introduces a concise self-learning paradigm, eliminating the need for complex and laborious multi-stage pre-training procedures. The versatility and superiority of TransFlow extend seamlessly to 3D scene motion, yielding competitive outcomes in 3D scene flow estimation. Our approach attains state-of-the-art results on benchmark datasets such as Sintel and KITTI-15, while also exhibiting exceptional performance on downstream tasks, including video object detection using the ImageNet VID dataset, video frame interpolation using the GoPro dataset, and video stabilization using the DeepStab dataset. We believe that the effectiveness of TransFlow positions it as a flexible baseline for both optical flow and scene flow estimation, offering promising avenues for future research and development.
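
The abstract names cross-attention between adjacent frames but gives no implementation detail. The following is only an illustrative sketch of generic scaled dot-product cross-attention, not TransFlow's actual architecture: it omits learned projections, multiple heads, and positional encodings, and the function names and toy feature values are hypothetical. Passing the same frame as queries, keys, and values would give the self-attention case.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query token (frame t)
    attends to all key/value tokens (frame t+1)."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query token to each key token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of value tokens, channel by channel.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Toy features (assumed for illustration): 3 tokens per frame, 2 channels each.
frame_t = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frame_t1 = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

attended, attn = cross_attention(frame_t, frame_t1, frame_t1)
```

Each row of `attn` is a probability distribution over the tokens of the second frame, so a token can draw on any location in the adjacent frame rather than only a local neighbourhood, which is the sense of "global dependencies" in the abstract.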


Similar articles

1

Optical Flow as Spatial-Temporal Attention Learners.

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11491-11506. doi: 10.1109/TPAMI.2024.3463648. Epub 2024 Nov 6.

2

A New Parallel Intelligence Based Light Field Dataset for Depth Refinement and Scene Flow Estimation.

Sensors (Basel). 2022 Dec 4;22(23):9483. doi: 10.3390/s22239483.

3

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding.

IEEE Trans Pattern Anal Mach Intell. 2019 Jul 23. doi: 10.1109/TPAMI.2019.2930258.

4

Unsupervised Learning of Optical Flow With CNN-based Non-Local Filtering.

IEEE Trans Image Process. 2020 Aug 5;PP. doi: 10.1109/TIP.2020.3013168.

5

Attention-Guided Disentangled Feature Aggregation for Video Object Detection.

Sensors (Basel). 2022 Nov 7;22(21):8583. doi: 10.3390/s22218583.

6

Kalman-Based Scene Flow Estimation for Point Cloud Densification and 3D Object Detection in Dynamic Scenes.

Sensors (Basel). 2024 Jan 31;24(3):916. doi: 10.3390/s24030916.

7

DSTAN: A Deformable Spatial-temporal Attention Network with Bidirectional Sequence Feature Refinement for Speckle Noise Removal in Thyroid Ultrasound Video.

J Imaging Inform Med. 2024 Dec;37(6):3264-3281. doi: 10.1007/s10278-023-00935-5. Epub 2024 Jun 5.

8

Multi-Stage Network for Event-Based Video Deblurring with Residual Hint Attention.

Sensors (Basel). 2023 Mar 7;23(6):2880. doi: 10.3390/s23062880.

9

TransVOD: End-to-End Video Object Detection With Spatial-Temporal Transformers.

IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7853-7869. doi: 10.1109/TPAMI.2022.3223955. Epub 2023 May 5.

10

Joint Stereo Video Deblurring, Scene Flow Estimation and Moving Object Segmentation.

IEEE Trans Image Process. 2019 Oct 11. doi: 10.1109/TIP.2019.2945867.