Action Recognition with Dynamic Image Networks.

Author information

Bilen Hakan, Fernando Basura, Gavves Efstratios, Vedaldi Andrea

Publication information

IEEE Trans Pattern Anal Mach Intell. 2018 Dec;40(12):2799-2813. doi: 10.1109/TPAMI.2017.2769085. Epub 2017 Nov 2.

Abstract

We introduce the concept of a dynamic image, a novel compact representation of videos useful for video analysis, particularly in combination with convolutional neural networks (CNNs). A dynamic image encodes temporal data such as RGB or optical flow videos by using the concept of 'rank pooling'. The idea is to learn a ranking machine that captures the temporal evolution of the data and to use the parameters of that machine as a representation. We call the resulting representation a dynamic image because it summarizes the video dynamics in addition to appearance. This idea makes it possible to convert any video to an image, so that existing CNN models pre-trained on still images can be immediately extended to videos. We also present an efficient approximate rank pooling operator that runs two orders of magnitude faster than the standard ones without any loss in ranking performance and can be formulated as a CNN layer. To demonstrate the power of the representation, we introduce a novel four-stream CNN architecture which can learn from RGB and optical flow frames as well as from their dynamic image representations. We show that the proposed network achieves state-of-the-art performance, 95.5 and 72.5 percent accuracy, on the UCF101 and HMDB51 datasets, respectively.
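
The abstract does not state the form of the approximate rank pooling operator. As a rough illustration only, the sketch below assumes a simplified weighting that follows from taking a single gradient step on a pairwise ranking objective starting from zero parameters: frame t of T receives the weight 2t - T - 1, so later frames count positively and earlier frames negatively. The function names `rank_pooling_coefficients` and `dynamic_image` are hypothetical, frames are represented as flat lists of pixel values for simplicity, and this is not the authors' implementation.

```python
def rank_pooling_coefficients(num_frames):
    """Return the assumed weights alpha_t = 2t - T - 1 for t = 1..T.

    This is a simplified weighting chosen for illustration; the weights
    are antisymmetric around the middle of the clip and sum to zero.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return [2 * t - num_frames - 1 for t in range(1, num_frames + 1)]


def dynamic_image(frames):
    """Collapse a video into one frame-sized vector by a weighted sum over time.

    frames: list of T frames, each a flat list of pixel values with the
    same length. Returns a flat list with that length.
    """
    if not frames:
        raise ValueError("frames must contain at least one frame")
    num_pixels = len(frames[0])
    if any(len(frame) != num_pixels for frame in frames):
        raise ValueError("all frames must have the same number of pixels")

    coefficients = rank_pooling_coefficients(len(frames))
    pooled = [0.0] * num_pixels
    # Accumulate alpha_t * frame_t pixel by pixel.
    for alpha, frame in zip(coefficients, frames):
        for i, value in enumerate(frame):
            pooled[i] += alpha * value
    return pooled


if __name__ == "__main__":
    # A pixel that brightens over time gives a positive response,
    # while a constant pixel cancels out to zero.
    video = [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]]
    print(dynamic_image(video))
```

Because the weights sum to zero, static content cancels and only temporal change survives, which is consistent with the abstract's claim that the representation summarizes dynamics. The operation is a fixed linear combination of frames, which is also why it can be expressed as a layer inside a CNN.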

