Adaptive detrending to accelerate convolutional gated recurrent unit training for contextual video recognition.

Affiliations

School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea; Cognitive Neurorobotics Research Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan.

School of Computing, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea.

Publication information

Neural Netw. 2018 Sep;105:356-370. doi: 10.1016/j.neunet.2018.05.009. Epub 2018 May 22.

Abstract

Video image recognition has been extensively studied with rapid progress recently. However, most methods focus on short-term rather than long-term (contextual) video recognition. Convolutional recurrent neural networks (ConvRNNs) provide robust spatio-temporal information processing capabilities for contextual video recognition, but require extensive computation that slows down training. Inspired by normalization and detrending methods, in this paper we propose "adaptive detrending" (AD) for temporal normalization in order to accelerate the training of ConvRNNs, especially of convolutional gated recurrent unit (ConvGRU). For each neuron in a recurrent neural network (RNN), AD identifies the trending change within a sequence and subtracts it, removing the internal covariate shift. In experiments testing for contextual video recognition with ConvGRU, results show that (1) ConvGRU clearly outperforms feed-forward neural networks, (2) AD consistently and significantly accelerates training and improves generalization, (3) performance is further improved when AD is coupled with other normalization methods, and most importantly, (4) the more long-term contextual information is required, the more AD outperforms existing methods.
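The idea of identifying a per-neuron trend within a sequence and subtracting it can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes the trend is tracked by an exponential moving average with a fixed smoothing factor (`decay`, a hypothetical parameter), whereas the method described in the abstract is adaptive. The function name `detrend_sequence` and the toy input are likewise illustrative only.

```python
def detrend_sequence(values, decay=0.9):
    """Subtract a running exponential-moving-average trend from one neuron's activations.

    Assumption: `decay` is a fixed, hypothetical smoothing factor. The method in
    the abstract is adaptive, so this constant only stands in for that behaviour.
    """
    trend = values[0]  # initialise the trend with the first time step
    detrended = []
    for v in values:
        # Update the slow-moving trend estimate with the current activation.
        trend = decay * trend + (1.0 - decay) * v
        # Keep only the deviation of the activation from its trend.
        detrended.append(v - trend)
    return detrended


# Toy input: one neuron's activations with a steady upward drift plus a small oscillation.
activations = [0.1 * t + (0.05 if t % 2 == 0 else -0.05) for t in range(50)]
residual = detrend_sequence(activations)
```

In this toy case the raw activations keep growing over the sequence, while the detrended values settle into a narrow band, which is the kind of shift in activation statistics that temporal normalization aims to suppress.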
