Self-Supervised Motion Perception for Spatiotemporal Representation Learning.

Authors

Liu Chang, Yao Yuan, Luo Dezhao, Zhou Yu, Ye Qixiang

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):9832-9846. doi: 10.1109/TNNLS.2022.3160860. Epub 2023 Nov 30.

DOI: 10.1109/TNNLS.2022.3160860
PMID: 35358053

Abstract

In this study, we propose a novel pretext task and a self-supervised motion perception (SMP) method for spatiotemporal representation learning. The pretext task is defined as video playback rate perception, which uses temporal dilated sampling to augment each video clip into multiple duplicates of different temporal resolutions. The SMP method is built upon discriminative and generative motion perception models, which collaboratively capture representations related to motion dynamics and appearance from video clips of multiple temporal resolutions. To enhance the collaboration, we further propose difference and convolution motion attention (MA), which drives the generative model to focus on motion-related appearance, and leverage multiple granularity perception (MG) to extract accurate motion dynamics. Extensive experiments demonstrate SMP's effectiveness for video motion perception and the state-of-the-art performance of the self-supervised representation models on target tasks, including action recognition and video retrieval. Code for SMP is available at github.com/yuanyao366/SMP.
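
The playback rate pretext task described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it only shows one plausible reading of temporal dilated sampling, where the same video is sampled with different frame strides and the stride index serves as the self-supervised label. The function name `dilated_clips`, the clip length of 16, and the rates (1, 2, 4, 8) are assumptions chosen for illustration.

```python
import numpy as np


def dilated_clips(video, clip_len=16, rates=(1, 2, 4, 8), start=0):
    """Sample one clip per playback rate from a (T, H, W, C) video array.

    Hypothetical helper: each clip takes every `rate`-th frame, so a larger
    rate looks like faster playback. The label is the index of the rate.
    """
    clips, labels = [], []
    for label, rate in enumerate(rates):
        # Frame indices start, start + rate, start + 2 * rate, ...
        idx = start + rate * np.arange(clip_len)
        if idx[-1] >= len(video):
            raise ValueError(f"video too short for rate {rate}")
        clips.append(video[idx])
        labels.append(label)
    return np.stack(clips), np.array(labels)


# Synthetic stand-in for a decoded video: 128 frames of 8x8 RGB.
video = np.random.default_rng(0).random((128, 8, 8, 3))
clips, labels = dilated_clips(video)
print(clips.shape, labels)
```

A classifier trained to predict `labels` from `clips` would have to attend to motion speed, which is the intuition behind using playback rate as a pretext signal.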

Similar articles

1. Self-Supervised Motion Perception for Spatiotemporal Representation Learning.
   IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):9832-9846. doi: 10.1109/TNNLS.2022.3160860. Epub 2023 Nov 30.
2. Cross-view motion consistent self-supervised video inter-intra contrastive for action representation understanding.
   Neural Netw. 2024 Nov;179:106578. doi: 10.1016/j.neunet.2024.106578. Epub 2024 Jul 26.
3. Self-Supervised Video Representation Learning by Uncovering Spatio-Temporal Statistics.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jul;44(7):3791-3806. doi: 10.1109/TPAMI.2021.3057833. Epub 2022 Jun 3.
4. TCGL: Temporal Contrastive Graph for Self-Supervised Video Representation Learning.
   IEEE Trans Image Process. 2022;31:1978-1993. doi: 10.1109/TIP.2022.3147032. Epub 2022 Feb 18.
5. Self-Supervised Representation Learning With Spatial-Temporal Consistency for Sign Language Recognition.
   IEEE Trans Image Process. 2024;33:4188-4201. doi: 10.1109/TIP.2024.3416881. Epub 2024 Jul 17.
6. Multi-Task Collaborative Network: Bridge the Supervised and Self-Supervised Learning for EEG Classification in RSVP Tasks.
   IEEE Trans Neural Syst Rehabil Eng. 2024;32:638-651. doi: 10.1109/TNSRE.2024.3357863. Epub 2024 Feb 1.
7. Multi-Granularity Anchor-Contrastive Representation Learning for Semi-Supervised Skeleton-Based Action Recognition.
   IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7559-7576. doi: 10.1109/TPAMI.2022.3222871. Epub 2023 May 5.
8. Spatial Pyramid Covariance-Based Compact Video Code for Robust Face Retrieval in TV-Series.
   IEEE Trans Image Process. 2016 Dec;25(12):5905-5919. doi: 10.1109/TIP.2016.2616297. Epub 2016 Oct 10.
9. A multi-scale self-supervised hypergraph contrastive learning framework for video question answering.
   Neural Netw. 2023 Nov;168:272-286. doi: 10.1016/j.neunet.2023.08.057. Epub 2023 Sep 16.
10. Contrastive Self-Supervised Pre-Training for Video Quality Assessment.
    IEEE Trans Image Process. 2022;31:458-471. doi: 10.1109/TIP.2021.3130536. Epub 2021 Dec 16.