Lyu Kejie, Li Yingming, Zhang Zhongfei
IEEE Trans Image Process. 2019 Oct 4. doi: 10.1109/TIP.2019.2944522.
Multi-task deep learning methods learn multiple tasks simultaneously and share representations among them, so that information from related tasks improves learning within each task and substantially enhances the generalization of the resulting models. Typical multi-task deep learning models share representations across tasks in the lower layers of the network and separate them in the higher layers. However, different groups of tasks have different requirements for sharing representations, so such a fixed design criterion does not guarantee that the resulting network architecture is optimal. In addition, most existing methods ignore the redundancy problem and lack a pre-screening process for representations before they are shared. Here, we propose a model called the Attention-aware Multi-task Convolutional Neural Network, which automatically learns appropriate sharing through end-to-end training. An attention mechanism is introduced into the architecture to suppress redundant content in the representations, and a shortcut connection is adopted to preserve useful information. We evaluate the model through experiments on different task groups and different datasets. It improves over existing techniques in many of these experiments, indicating its effectiveness and robustness. We also demonstrate the importance of the attention mechanism and the shortcut connection in our model.
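The abstract pairs two mechanisms: an attention mask that suppresses redundant channels in a shared representation, and a shortcut connection that adds the original features back so useful information is not lost. A minimal, hypothetical sketch of that combination (the function name, weights, and scalar per-channel form are assumptions for illustration, not the paper's actual architecture):

```python
import math

def attention_shortcut(shared, weights):
    """Gate each channel of `shared` with a sigmoid attention weight
    (suppressing redundancy), then add the input back via a shortcut
    connection (preserving the original information)."""
    out = []
    for x, w in zip(shared, weights):
        mask = 1.0 / (1.0 + math.exp(-w * x))  # attention value in (0, 1)
        out.append(mask * x + x)               # gated feature + shortcut
    return out

# Toy shared representation (3 channels) with assumed per-channel weights.
features = [1.0, -2.0, 0.5]
weights = [0.5, 0.5, 0.5]
gated = attention_shortcut(features, weights)
```

Because the mask lies in (0, 1), each output channel keeps between 1x and 2x the magnitude of its input: channels the mask down-weights stay close to the original (the shortcut preserves them), while channels the mask emphasizes are amplified.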