Zhang Pengfei, Xue Jianru, Lan Cuiling, Zeng Wenjun, Gao Zhanning, Zheng Nanning
IEEE Trans Image Process. 2019 Sep 2. doi: 10.1109/TIP.2019.2937724.
Recurrent neural networks (RNNs) are capable of modeling temporal dependencies of complex sequential data. In general, currently available RNN structures tend to concentrate on controlling the contributions of current and previous information. However, the exploration of different importance levels of different elements within an input vector has largely been ignored. We propose a simple yet effective Element-wise-Attention Gate (EleAttG), which can easily be added to an RNN block (e.g., all RNN neurons in an RNN layer) to empower the RNN neurons with attentiveness capability. For an RNN block, an EleAttG is used to adaptively modulate the input by assigning different levels of importance, i.e., attention, to each element/dimension of the input. We refer to an RNN block equipped with an EleAttG as an EleAtt-RNN block. Instead of modulating the input as a whole, the EleAttG modulates the input at fine granularity, i.e., element-wise, and the modulation is content adaptive. The proposed EleAttG, as an additional fundamental unit, is general and can be applied to any RNN structure, e.g., standard RNN, Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU). We demonstrate the effectiveness of the proposed EleAtt-RNN by applying it to different tasks, including action recognition from both skeleton-based data and RGB videos, gesture recognition, and sequential MNIST classification. Experiments show that adding attentiveness through EleAttGs to RNN blocks significantly improves the power of RNNs.
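The mechanism described above can be sketched as follows: an attention vector, one value per input dimension, is computed from the current input and the previous hidden state, and the input is multiplied element-wise by that vector before the usual recurrent update. This is a minimal NumPy sketch based on my reading of the abstract; the weight names (`W_xa`, `W_ha`, etc.) and the sigmoid/tanh choices are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class EleAttRNNCell:
    """Standard RNN cell augmented with an Element-wise-Attention Gate.

    The gate produces one attention value in (0, 1) per input
    dimension, conditioned on the current input and previous hidden
    state, and modulates the input element-wise before the recurrent
    update. (Sketch; parameter names are illustrative.)
    """

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # EleAttG parameters: map (x_t, h_{t-1}) -> attention over input dims
        self.W_xa = rng.normal(0.0, s, (input_dim, input_dim))
        self.W_ha = rng.normal(0.0, s, (hidden_dim, input_dim))
        self.b_a = np.zeros(input_dim)
        # Standard RNN parameters
        self.W_xh = rng.normal(0.0, s, (input_dim, hidden_dim))
        self.W_hh = rng.normal(0.0, s, (hidden_dim, hidden_dim))
        self.b_h = np.zeros(hidden_dim)

    def step(self, x_t, h_prev):
        # Content-adaptive, element-wise attention over the input vector
        a_t = sigmoid(x_t @ self.W_xa + h_prev @ self.W_ha + self.b_a)
        x_mod = a_t * x_t  # element-wise modulation of the input
        h_t = np.tanh(x_mod @ self.W_xh + h_prev @ self.W_hh + self.b_h)
        return h_t, a_t

# Run the cell over a short random sequence.
cell = EleAttRNNCell(input_dim=4, hidden_dim=8)
rng = np.random.default_rng(1)
h = np.zeros(8)
for t in range(5):
    x = rng.normal(size=4)
    h, a = cell.step(x, h)
```

The same gate could be prepended to an LSTM or GRU cell: only the input modulation `a_t * x_t` is added, while the rest of the cell's update equations stay unchanged, which is why the abstract calls the EleAttG general across RNN structures.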