Li Xiaorong, Wang Shipeng, Sun Jian, Xu Zongben
IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12618-12634. doi: 10.1109/TPAMI.2023.3271626. Epub 2023 Sep 5.
Deep neural networks suffer from catastrophic forgetting when trained on sequential tasks in continual learning. Many methods mitigate catastrophic forgetting by storing data from previous tasks, which is often prohibited in real-world applications due to privacy and security concerns. In this paper, we consider a realistic continual learning setting in which training data from previous tasks are unavailable and memory resources are limited. We contribute a novel knowledge distillation-based method in an information-theoretic framework that maximizes the mutual information between the outputs of the previously learned and current networks. Because the mutual information is intractable to compute, we instead maximize its variational lower bound, where the covariance of the variational distribution is modeled by a graph convolutional network. The inaccessibility of previous task data is tackled by Taylor expansion, yielding a novel regularizer in the network training loss for continual learning. The regularizer relies on compressed gradients of the network parameters, avoiding the storage of both previous task data and previously learned networks. Additionally, we employ a self-supervised learning technique to learn effective features, which further improves continual learning performance. We conduct extensive experiments on image classification and semantic segmentation, and the results show that our method achieves state-of-the-art performance on continual learning benchmarks.
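The core idea in the abstract can be sketched concretely. A standard variational lower bound on mutual information (the Barber–Agakov bound) states that I(T; S) ≥ H(T) + E[log q(T | S)] for any variational distribution q, so maximizing the expected log-density E[log q(T | S)] tightens a lower bound on the mutual information between teacher outputs T and student outputs S. The minimal sketch below assumes a *diagonal* Gaussian q with mean equal to the student's output, which is a simplification: the paper models a full covariance with a graph convolutional network, and the function name and signature here are hypothetical illustrations, not the authors' implementation.

```python
import math

def gaussian_logdensity_term(y_teacher, y_student, var):
    """Average log q(y_teacher | y_student) over a batch, where q is a
    diagonal Gaussian with mean y_student and per-dimension variances `var`.

    By the Barber-Agakov bound, I(T; S) >= H(T) + E[log q(T | S)];
    since H(T) does not depend on the current network, maximizing this
    term maximizes a variational lower bound on the mutual information
    between teacher and student outputs.
    """
    batch = len(y_teacher)
    total = 0.0
    for t_row, s_row in zip(y_teacher, y_student):
        for t, s, v in zip(t_row, s_row, var):
            # per-dimension Gaussian log-density of the teacher output
            # under a Gaussian centered at the student output
            total += -0.5 * (math.log(2 * math.pi * v) + (t - s) ** 2 / v)
    return total / batch
```

With unit variances this reduces to a constant minus half the squared-error distillation loss, which makes the connection between MI maximization and conventional knowledge distillation explicit; the learned (non-diagonal) covariance in the paper generalizes this by weighting and correlating output dimensions.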