BQN: Busy-Quiet Net Enabled by Motion Band-Pass Module for Action Recognition.

Author information

Huang Guoxi, Bors Adrian G

Publication information

IEEE Trans Image Process. 2022;31:4966-4979. doi: 10.1109/TIP.2022.3189810. Epub 2022 Aug 1.

Abstract

A rich video data representation can be obtained through spatio-temporal frequency analysis. In this study we show that a video can be disentangled, after learning video characteristics according to their spatio-temporal properties, into two complementary information components, dubbed Busy and Quiet. The Busy information characterizes the boundaries of moving regions, moving objects, or regions where movement changes. The Quiet information, in turn, encodes global smooth spatio-temporal structures characterized by substantial redundancy. We design a trainable Motion Band-Pass Module (MBPM) for separating Busy and Quiet information in raw video data. We build a Busy-Quiet Net (BQN) by embedding the MBPM into a two-pathway CNN architecture. BQN gains its efficiency by avoiding redundancy in the feature spaces defined by the two pathways: one pathway processes the Busy features, while the other processes the Quiet features at lower spatio-temporal resolutions, reducing both memory and computational costs. Experiments show that the proposed MBPM can be used as a plug-in module in various CNN backbone architectures, significantly boosting their performance. The proposed BQN is shown to outperform many recent video models on the Something-Something V1, Kinetics400, UCF101 and HMDB51 datasets.
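
The Busy/Quiet split described in the abstract can be illustrated with a small, dependency-free sketch. This is not the paper's MBPM, which is trainable and band-pass. As a stand-in it uses a fixed 3x3x3 box low-pass filter for the Quiet part and the residual for the Busy part, so the two components are complementary by construction. The function names (`box_lowpass`, `split_busy_quiet`, `downsample`), the toy clip, and the downsampling factor of 2 are all assumptions made for illustration.

```python
def box_lowpass(video):
    """Average each pixel with its 3x3x3 spatio-temporal neighbours.

    `video` is a nested list indexed as video[t][y][x]; indices that fall
    outside the clip are clamped to the nearest valid position.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for y in range(H):
            for x in range(W):
                total = 0.0
                for dt in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            total += video[clamp(t + dt, T)][clamp(y + dy, H)][clamp(x + dx, W)]
                out[t][y][x] = total / 27.0
    return out


def split_busy_quiet(video):
    """Return (busy, quiet): quiet is the smooth part, busy is the residual."""
    quiet = box_lowpass(video)
    busy = [
        [[v - q for v, q in zip(v_row, q_row)] for v_row, q_row in zip(v_frame, q_frame)]
        for v_frame, q_frame in zip(video, quiet)
    ]
    return busy, quiet


def downsample(video, step=2):
    """Keep every `step`-th frame, row and column (hypothetical Quiet-path input)."""
    return [[row[::step] for row in frame[::step]] for frame in video[::step]]


def num_values(video):
    """Number of scalar values stored in a video[t][y][x] clip."""
    return len(video) * len(video[0]) * len(video[0][0])


# Toy clip: a bright 2x2 square moving one pixel to the right per frame.
T, H, W = 4, 8, 8
clip = [[[0.0] * W for _ in range(H)] for _ in range(T)]
for t in range(T):
    for y in (3, 4):
        for x in (t + 1, t + 2):
            clip[t][y][x] = 1.0

busy, quiet = split_busy_quiet(clip)
quiet_small = downsample(quiet)  # smooth content tolerates a coarser grid
```

In this toy setting the Busy component is nonzero only around the moving square and vanishes on the static background, while the Quiet component, being smooth, is subsampled by 2 along time, height and width and so holds one eighth as many values. That mirrors, in a very simplified way, why processing Quiet features at lower resolution can reduce memory and computation.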
