
BQN: Busy-Quiet Net Enabled by Motion Band-Pass Module for Action Recognition

Authors

Huang Guoxi, Bors Adrian G

Publication

IEEE Trans Image Process. 2022;31:4966-4979. doi: 10.1109/TIP.2022.3189810. Epub 2022 Aug 1.

DOI: 10.1109/TIP.2022.3189810
PMID: 35853053
Abstract

A rich video data representation can be realized by means of spatio-temporal frequency analysis. In this research study we show that a video can be disentangled, following the learning of video characteristics according to their spatio-temporal properties, into two complementary information components, dubbed Busy and Quiet. The Busy information characterizes the boundaries of moving regions, moving objects, or regions of change in movement. Meanwhile, the Quiet information encodes global smooth spatio-temporal structures defined by substantial redundancy. We design a trainable Motion Band-Pass Module (MBPM) for separating Busy and Quiet-defined information, in raw video data. We model a Busy-Quiet Net (BQN) by embedding the MBPM into a two-pathway CNN architecture. The efficiency of BQN is determined by avoiding redundancy in the feature spaces defined by the two pathways. While one pathway processes the Busy features, the other processes Quiet features at lower spatio-temporal resolutions reducing both memory and computational costs. Through experiments we show that the proposed MBPM can be used as a plug-in module in various CNN backbone architectures, significantly boosting their performance. The proposed BQN is shown to outperform many recent video models on Something-Something V1, Kinetics400, UCF101 and HMDB51 datasets.
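The abstract does not specify how the trainable MBPM is implemented, but the core idea it describes — splitting a video into a band-pass "Busy" component (motion boundaries and change) and a residual smooth "Quiet" component — can be illustrated with a crude fixed analogue: a difference of two temporal box filters. Everything below (function names, window sizes) is an illustrative assumption, not the paper's actual module.

```python
import numpy as np

def motion_band_pass(video, k_small=1, k_large=3):
    """Hypothetical stand-in for the MBPM: a temporal band-pass built as the
    difference of two temporal box (moving-average) filters.
    video: array of shape (T, H, W)."""
    def temporal_box(x, k):
        # Moving average over time with window 2k+1; edges are clamped.
        T = x.shape[0]
        out = np.empty_like(x, dtype=float)
        for t in range(T):
            lo, hi = max(0, t - k), min(T, t + k + 1)
            out[t] = x[lo:hi].mean(axis=0)
        return out
    x = video.astype(float)
    smooth_fine = temporal_box(x, k_small)    # keeps fast temporal detail
    smooth_coarse = temporal_box(x, k_large)  # keeps only slow structure
    busy = smooth_fine - smooth_coarse        # mid-frequency motion ("Busy")
    quiet = x - busy                          # smooth, redundant remainder ("Quiet")
    return busy, quiet

# By construction the two components recompose the input exactly,
# mirroring the complementary Busy/Quiet decomposition described above.
video = np.random.rand(8, 4, 4)
busy, quiet = motion_band_pass(video)
assert np.allclose(busy + quiet, video)
```

In BQN the Quiet pathway would then process its component at reduced spatio-temporal resolution (e.g., after downsampling `quiet`), which is where the memory and compute savings come from; here the split itself is the only part sketched.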


Similar Articles

1. BQN: Busy-Quiet Net Enabled by Motion Band-Pass Module for Action Recognition.
   IEEE Trans Image Process. 2022;31:4966-4979. doi: 10.1109/TIP.2022.3189810. Epub 2022 Aug 1.
2. MEST: An Action Recognition Network with Motion Encoder and Spatio-Temporal Module.
   Sensors (Basel). 2022 Sep 1;22(17):6595. doi: 10.3390/s22176595.
3. AMS-Net: Modeling Adaptive Multi-Granularity Spatio-Temporal Cues for Video Action Recognition.
   IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):18731-18745. doi: 10.1109/TNNLS.2023.3321141. Epub 2024 Dec 2.
4. Action Recognition With Motion Diversification and Dynamic Selection.
   IEEE Trans Image Process. 2022;31:4884-4896. doi: 10.1109/TIP.2022.3189811. Epub 2022 Jul 22.
5. A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention.
   Entropy (Basel). 2022 Mar 4;24(3):368. doi: 10.3390/e24030368.
6. Multi-Scale Spatio-Temporal Memory Network for Lightweight Video Denoising.
   IEEE Trans Image Process. 2024;33:5810-5823. doi: 10.1109/TIP.2024.3444315. Epub 2024 Oct 15.
7. Temporal Segment Networks for Action Recognition in Videos.
   IEEE Trans Pattern Anal Mach Intell. 2019 Nov;41(11):2740-2755. doi: 10.1109/TPAMI.2018.2868668. Epub 2018 Sep 3.
8. Spatio-temporal Laplacian pyramid coding for action recognition.
   IEEE Trans Cybern. 2014 Jun;44(6):817-27. doi: 10.1109/TCYB.2013.2273174. Epub 2013 Jul 31.
9. Sequential Video VLAD: Training the Aggregation Locally and Temporally.
   IEEE Trans Image Process. 2018 Oct;27(10):4933-4944. doi: 10.1109/TIP.2018.2846664.
10. Spatial-Temporal Pyramid Graph Reasoning for Action Recognition.
    IEEE Trans Image Process. 2022;31:5484-5497. doi: 10.1109/TIP.2022.3196175. Epub 2022 Aug 22.

Cited By

1. Temporal-Spatial Redundancy Reduction in Video Sequences: A Motion-Based Entropy-Driven Attention Approach.
   Biomimetics (Basel). 2025 Mar 21;10(4):192. doi: 10.3390/biomimetics10040192.