Kessler Samuel, Cobb Adam, Rudner Tim G J, Zohren Stefan, Roberts Stephen J
Department of Engineering Science, University of Oxford, Oxford OX2 6ED, UK.
SRI International, Arlington, VA 22209, USA.
Entropy (Basel). 2023 May 31;25(6):884. doi: 10.3390/e25060884.
Sequential Bayesian inference can be used to prevent catastrophic forgetting of past tasks and to provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and assess whether using the previous task's posterior as a prior for a new task can prevent catastrophic forgetting in Bayesian neural networks. Our first contribution is to perform sequential Bayesian inference using Hamiltonian Monte Carlo. We propagate the posterior as a prior for new tasks by fitting a density estimator on Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty of performing sequential Bayesian inference in neural networks. We then study simple analytical examples of sequential Bayesian inference and continual learning and highlight the issue of model misspecification, which can lead to sub-optimal continual learning performance despite exact inference. Furthermore, we discuss how task data imbalances can cause forgetting. Given these limitations, we argue that we need probabilistic models of the continual learning generative process rather than relying on sequential Bayesian inference over Bayesian neural network weights. Our final contribution is to propose a simple baseline that is competitive with the best-performing Bayesian continual learning methods on class-incremental continual learning computer vision benchmarks.
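The posterior-as-prior recursion the abstract describes can be illustrated with a minimal sketch (not the paper's method, which uses Hamiltonian Monte Carlo over neural network weights): a conjugate Gaussian model with known observation variance, where the posterior after one "task" becomes the prior for the next. The task data and prior parameters below are illustrative assumptions. With exact conjugate inference, sequentially updating over two tasks recovers the same posterior as batch inference on all data at once, which is precisely the property that approximate inference in neural networks fails to preserve.

```python
import math

def gaussian_update(prior_mu, prior_var, data, obs_var=1.0):
    """Conjugate update of a Gaussian prior N(prior_mu, prior_var) on an
    unknown mean, given observations with known variance obs_var."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / obs_var)
    post_mu = post_var * (prior_mu / prior_var + sum(data) / obs_var)
    return post_mu, post_var

# Two "tasks" observed sequentially (illustrative data).
task1 = [0.9, 1.1, 1.0]
task2 = [1.2, 0.8]

# Sequential inference: the posterior from task 1 is the prior for task 2.
mu1, var1 = gaussian_update(0.0, 10.0, task1)
mu_seq, var_seq = gaussian_update(mu1, var1, task2)

# Batch inference: all data at once from the original prior.
mu_batch, var_batch = gaussian_update(0.0, 10.0, task1 + task2)

# Exact inference makes the two routes agree.
assert math.isclose(mu_seq, mu_batch)
assert math.isclose(var_seq, var_batch)
```

In a Bayesian neural network the posterior is intractable, so each task's posterior must be approximated (e.g., by a density estimator fitted to Hamiltonian Monte Carlo samples) before being reused as a prior; the paper's finding is that errors introduced at this step accumulate and catastrophic forgetting persists.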