Cross-Modal Contrastive Learning Network for Few-Shot Action Recognition.

Authors

Wang Xiao, Yan Yan, Hu Hai-Miao, Li Bo, Wang Hanzi

Publication information

IEEE Trans Image Process. 2024;33:1257-1271. doi: 10.1109/TIP.2024.3354104. Epub 2024 Feb 13.

DOI: 10.1109/TIP.2024.3354104
PMID: 38252570

Abstract

Few-shot action recognition aims to recognize new, unseen categories from only a few labeled samples per class. However, it still suffers from inadequate data, which easily leads to overfitting and poor generalization. We therefore propose a cross-modal contrastive learning network (CCLN), consisting of an adversarial branch and a contrastive branch, to perform effective few-shot action recognition. In the adversarial branch, we design a prototypical generative adversarial network (PGAN) that synthesizes samples to enlarge the training set, which mitigates data scarcity and thereby alleviates overfitting. When training samples are limited, the obtained visual features are usually suboptimal for video understanding because they lack discriminative information. To address this issue, in the contrastive branch, we propose a cross-modal contrastive learning module (CCLM) that uses semantic information to obtain discriminative feature representations of samples, enhancing the network's feature learning ability at the class level. Moreover, since videos contain crucial sequence and ordering information, we introduce a spatial-temporal enhancement module (SEM) to model the spatial context within video frames and the temporal context across video frames. The experimental results show that the proposed CCLN outperforms the state-of-the-art few-shot action recognition methods on four challenging benchmarks: Kinetics, UCF101, HMDB51 and SSv2.
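
The abstract does not state the loss used inside the CCLM. As a rough illustration of class-level cross-modal contrast, the sketch below uses an InfoNCE-style objective that pulls each video feature toward the semantic embedding of its own class and away from the other classes' embeddings. The function names, the temperature value, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def cross_modal_contrastive_loss(visual, semantic, labels, temperature=0.1):
    """InfoNCE-style loss between video features and class semantic embeddings.

    visual:   (N, D) video features, assumed already projected to dimension D
    semantic: (C, D) one semantic (e.g. label-text) embedding per class (assumed input)
    labels:   (N,)   integer class index of each video
    """
    v = l2_normalize(visual)
    s = l2_normalize(semantic)
    logits = v @ s.T / temperature                        # (N, C) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # subtract row max for stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability of the correct class for each video.
    return -log_probs[np.arange(len(labels)), labels].mean()


# Toy 3-way, 2-shot episode with random 8-dimensional features (hypothetical data).
rng = np.random.default_rng(0)
semantic = rng.normal(size=(3, 8))
labels = np.array([0, 0, 1, 1, 2, 2])

# Videos that lie close to their own class embedding.
aligned = semantic[labels] + 0.05 * rng.normal(size=(6, 8))
# Videos that lie close to a different class's embedding.
mismatched = semantic[(labels + 1) % 3] + 0.05 * rng.normal(size=(6, 8))

loss_aligned = cross_modal_contrastive_loss(aligned, semantic, labels)
loss_mismatched = cross_modal_contrastive_loss(mismatched, semantic, labels)
print(f"aligned: {loss_aligned:.4f}  mismatched: {loss_mismatched:.4f}")
```

In this toy setup the loss is lower when video features sit near the semantic embedding of their own class, which is the behavior a class-level contrastive objective is meant to reward.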

Similar articles

1
Cross-Modal Contrastive Learning Network for Few-Shot Action Recognition.
IEEE Trans Image Process. 2024;33:1257-1271. doi: 10.1109/TIP.2024.3354104. Epub 2024 Feb 13.
2
Contrastive Prototype-Guided Generation for Generalized Zero-Shot Learning.
Neural Netw. 2024 Aug;176:106324. doi: 10.1016/j.neunet.2024.106324. Epub 2024 Apr 15.
3
Augmented semantic feature based generative network for generalized zero-shot learning.
Neural Netw. 2021 Nov;143:1-11. doi: 10.1016/j.neunet.2021.04.014. Epub 2021 Apr 21.
4
Few-shot disease recognition algorithm based on supervised contrastive learning.
Front Plant Sci. 2024 Feb 7;15:1341831. doi: 10.3389/fpls.2024.1341831. eCollection 2024.
5
Leveraging Balanced Semantic Embedding for Generative Zero-Shot Learning.
IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):9575-9582. doi: 10.1109/TNNLS.2022.3208525. Epub 2023 Oct 27.
6
Semantics-Guided Contrastive Network for Zero-Shot Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2024 Mar;46(3):1530-1544. doi: 10.1109/TPAMI.2021.3140070. Epub 2024 Feb 6.
7
Fine-Grained Feature Generation for Generalized Zero-Shot Video Classification.
IEEE Trans Image Process. 2023;32:1599-1612. doi: 10.1109/TIP.2023.3247167. Epub 2023 Mar 6.
8
A Pairwise Attentive Adversarial Spatiotemporal Network for Cross-Domain Few-Shot Action Recognition-R2.
IEEE Trans Image Process. 2021;30:767-782. doi: 10.1109/TIP.2020.3038372. Epub 2020 Dec 4.
9
SCL: Self-supervised contrastive learning for few-shot image classification.
Neural Netw. 2023 Aug;165:19-30. doi: 10.1016/j.neunet.2023.05.037. Epub 2023 May 24.
10
Transformer-Based Approach Via Contrastive Learning for Zero-Shot Detection.
Int J Neural Syst. 2023 Jul;33(7):2350035. doi: 10.1142/S0129065723500351. Epub 2023 Jun 14.