Zeno Chen, Golan Itay, Hoffer Elad, Soudry Daniel
Department of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 3299993, Israel
Habana-Labs, Caesarea 3079821, Israel
Neural Comput. 2021 Oct 12;33(11):3139-3177. doi: 10.1162/neco_a_01430.
Catastrophic forgetting is the notorious vulnerability of neural networks to changes in the data distribution during learning. This phenomenon has long been considered a major obstacle to deploying learning agents in realistic continual learning settings. A large body of continual learning research assumes that task boundaries are known during training. However, only a few works consider scenarios in which task boundaries are unknown or not well defined: task-agnostic scenarios. The optimal Bayesian solution to this problem requires an intractable online Bayes update to the weights posterior. We aim to approximate the online Bayes update as accurately as possible. To do so, we derive novel fixed-point equations for the online variational Bayes optimization problem for multivariate Gaussian parametric distributions. By iterating the posterior through these fixed-point equations, we obtain an algorithm (FOO-VB) for continual learning that can handle nonstationary data distributions using a fixed architecture and without external memory (i.e., without access to previous data). We demonstrate that FOO-VB outperforms existing methods in task-agnostic scenarios. A PyTorch implementation of FOO-VB is available at https://github.com/chenzeno/FOO-VB.
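To make the "online Bayes update" concrete: for a scalar parameter with a Gaussian prior and Gaussian observation noise, the posterior stays Gaussian and the update has a closed form, so it can be applied one observation at a time with no stored data. This is only the tractable conjugate special case of the update that FOO-VB approximates for neural-network weights; the function below is an illustrative sketch, not part of the paper's algorithm.

```python
def online_gaussian_update(mu, var, y, noise_var):
    """One exact online Bayes step for a Gaussian prior N(mu, var) over a
    scalar parameter theta, given an observation y ~ N(theta, noise_var).
    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the observation."""
    precision = 1.0 / var + 1.0 / noise_var
    new_var = 1.0 / precision
    new_mu = new_var * (mu / var + y / noise_var)
    return new_mu, new_var

# Stream observations one at a time -- no replay buffer, no access to
# previous data; the (mu, var) pair summarizes everything seen so far.
mu, var = 0.0, 100.0  # broad prior
for y in [1.2, 0.8, 1.1, 0.9]:
    mu, var = online_gaussian_update(mu, var, y, noise_var=0.25)

print(round(mu, 3), round(var, 4))  # → 0.999 0.0625
```

For multivariate Gaussian posteriors over neural-network weights the likelihood is non-conjugate, so no such closed form exists; the paper's fixed-point equations are what replace this exact recursion there.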