
Cross-view motion consistent self-supervised video inter-intra contrastive for action representation understanding.

Affiliations

School of Information Science and Engineering, Yanshan University, Qinhuangdao, 066000, China; Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao, 066000, China.

Publication information

Neural Netw. 2024 Nov;179:106578. doi: 10.1016/j.neunet.2024.106578. Epub 2024 Jul 26.

DOI: 10.1016/j.neunet.2024.106578
PMID: 39111158
Abstract

Self-supervised contrastive learning draws on powerful representation models to acquire generic semantic features from unlabeled data, and the key to training such models lies in how accurately they track motion features. Previous video contrastive learning methods have extensively used spatial or temporal augmentations as similar instances, resulting in models that are more likely to learn static backgrounds than motion features. To alleviate these background shortcuts, we propose a cross-view motion consistent (CVMC) self-supervised video inter-intra contrastive model that focuses on learning local details and long-term temporal relationships. Specifically, we first extract the dynamic features of consecutive video snippets and then align these features based on multi-view motion consistency. Meanwhile, we use the optimized dynamic features in two contrasts: an instance-level comparison across different videos, and a comparison of local fine-grained spatial detail and temporal order within the same video. Ultimately, the joint optimization of spatio-temporal alignment and motion discrimination addresses the lack of instance recognition, spatial compactness, and temporal perception in self-supervised learning. Experimental results show that the proposed self-supervised model effectively learns visual representations and achieves highly competitive performance against other state-of-the-art methods in both action recognition and video retrieval tasks.
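
The abstract does not give the loss functions, so the following is only a minimal sketch of a generic InfoNCE-style contrastive loss, the kind of objective that inter-video and intra-video contrast is commonly built on. It is not the CVMC implementation. The function names, the temperature of 0.1, the toy feature vectors, and the choice of a shuffled clip as the intra-video negative are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log(exp(s_pos/t) / sum_k exp(s_k/t))."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[0]

# Toy features (hypothetical values, not taken from the paper).
clip_a = [1.0, 0.2, 0.0]                    # anchor clip, view 1
clip_a_other_view = [0.9, 0.3, 0.1]         # same clip, view 2 (positive)
clip_from_other_video = [0.0, 1.0, 0.5]     # inter-video negative
shuffled_clip_same_video = [0.5, 0.1, 0.9]  # assumed intra-video negative

inter_loss = info_nce(clip_a, clip_a_other_view, [clip_from_other_video])
intra_loss = info_nce(clip_a, clip_a_other_view, [shuffled_clip_same_video])
total_loss = inter_loss + intra_loss  # unweighted sum, an assumption
print(inter_loss, intra_loss, total_loss)
```

In this sketch the loss shrinks as the anchor moves closer to its positive and further from its negatives. A joint objective of the kind the abstract describes would add an alignment term, which is omitted here because the abstract does not specify it.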

