Yin Shao-Yu, Huang Yu, Chang Tien-Yu, Chang Shih-Fang, Tseng Vincent S
Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC.
Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC; Information and Communications Research Laboratories, Industrial Technology Research Institute, Hsinchu, Taiwan, ROC.
Neural Netw. 2023 Jan;158:171-187. doi: 10.1016/j.neunet.2022.10.031. Epub 2022 Nov 11.
Continual learning is an emerging branch of deep learning that aims to train a model on a series of tasks continually without forgetting the knowledge obtained from previous tasks. Despite receiving considerable attention in the research community, continual learning techniques for temporal data are still underexplored. In this paper, we address temporal-based continual learning by allowing a model to learn continuously on temporal data. To solve the catastrophic forgetting problem that arises when learning temporal data in task-incremental scenarios, we propose a novel method based on attentive recurrent neural networks, called Temporal Teacher Distillation (TTD). TTD addresses catastrophic forgetting in an attentive recurrent neural network through three hypotheses: the Rotation Hypothesis, the Redundant Hypothesis, and the Recover Hypothesis. The Rotation and Redundant Hypotheses can cause the attention-shift phenomenon, which degrades model performance on previously learned tasks, while ignoring the Recover Hypothesis incurs extra memory usage when training on successive tasks. The proposed TTD, built on these hypotheses, therefore complements the shortcomings of existing methods for temporal-based continual learning. To evaluate the proposed method in the task-incremental setting, we use a public dataset, WIreless Sensor Data Mining (WISDM), and a synthetic dataset, Split-QuickDraw-100. Experimental results show that TTD significantly outperforms state-of-the-art methods by up to 14.6% and 45.1% in terms of accuracy and forgetting measures, respectively. To the best of our knowledge, this is the first work to study continual learning on real-world incremental categories for temporal data classification with attentive recurrent neural networks and to provide a proper application-oriented scenario.
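The abstract does not detail TTD's loss formulation, but the core idea it names, teacher distillation against catastrophic forgetting, is commonly realized by penalizing the student's divergence from a frozen teacher (the model snapshot from the previous task) alongside the new-task loss. The sketch below is a generic, illustrative version of that pattern, not the paper's actual method; the function names, the temperature, and the weighting factor `alpha` are all assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields softer distributions."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Distillation term: cross-entropy between the frozen teacher's softened
    outputs and the student's softened outputs on the same inputs."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1))

def continual_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=2.0):
    """Total loss = cross-entropy on the new task's labels
    + alpha * distillation penalty that discourages drifting from the teacher."""
    p = softmax(student_logits)
    ce = -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    return ce + alpha * distillation_loss(student_logits, teacher_logits, temperature)
```

In this sketch, the distillation term grows as the student's predictions drift away from the teacher's, which is what preserves performance on earlier tasks; `alpha` trades off stability on old tasks against plasticity on the new one.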