Li Chan, Huang Zhenye, Zou Wenxuan, Huang Haiping
PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China.
CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China.
Phys Rev E. 2023 Jul;108(1-1):014309. doi: 10.1103/PhysRevE.108.014309.
An obstacle to artificial general intelligence is set by continual learning of multiple tasks of a different nature. Recently, various heuristic tricks, from both machine-learning and neuroscience angles, were proposed, but they lack a unified theoretical foundation. Here, we focus on continual learning in single-layered and multilayered neural networks of binary weights. A variational Bayesian learning setting is proposed in which the networks are trained in a field space rather than in the discrete-weight space, where gradients are ill-defined; moreover, weight uncertainty is naturally incorporated and modulates synaptic resources among tasks. From a physics perspective, we translate variational continual learning into a Franz-Parisi thermodynamic-potential framework, where knowledge of previous tasks serves as both a prior probability and a reference configuration. We thus interpret continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be studied analytically with mean-field order parameters, whose predictions coincide with numerical experiments using stochastic gradient descent methods. Based on the variational principle and a Gaussian-field approximation of the internal preactivations in hidden layers, we also derive a learning algorithm that takes weight uncertainty into account; it solves continual learning with binary weights in multilayered neural networks and performs better than the currently available metaplasticity algorithm, in which binary synapses carry hidden continuous states and synaptic plasticity is modulated by a heuristic regularization function. Our proposed principled frameworks also connect to elastic weight consolidation, weight-uncertainty-modulated learning, and neuroscience-inspired metaplasticity, providing a theoretically grounded method for real-world multitask learning with deep networks.
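For concreteness, the variational setting can be sketched in standard notation (the symbols below are our labels, not necessarily the paper's). Each binary weight w_i = ±1 is parameterized by a continuous field θ_i,

$$ q_\theta(\mathbf{w}) = \prod_i \frac{e^{\theta_i w_i}}{2\cosh\theta_i}, \qquad \langle w_i \rangle = \tanh\theta_i, $$

and learning task t amounts to minimizing a variational free energy in which the previous task's posterior plays the role of the prior,

$$ \mathcal{F}_t[q] = \mathbb{E}_{q}\big[\mathcal{L}_t(\mathbf{w})\big] + \mathrm{KL}\big(q \,\Vert\, q_{t-1}\big). $$

In the statistical-mechanics reading, the standard Franz-Parisi potential is the free energy of a system constrained to a fixed overlap s with a reference configuration w* drawn from the previous task's equilibrium measure,

$$ V(s) = -\frac{1}{\beta N}\, \mathbb{E}_{\mathbf{w}^{\ast}\sim P_{t-1}} \ln \sum_{\mathbf{w}} e^{-\beta E_t(\mathbf{w})}\, \delta\Big(\sum_i w_i w_i^{\ast} - Ns\Big). $$

The resulting field-space learning rule can likewise be sketched in a few lines. Below is a minimal NumPy sketch for a single-layer binary perceptron with ±1 labels under the Gaussian preactivation approximation; the function names, the penalty weight lam, and the choice to drop the gradient path through the preactivation variance are illustrative assumptions, not the paper's exact algorithm.

import numpy as np
from scipy.stats import norm

def kl_binary_fields(theta, theta_prev):
    # KL( q_theta || q_theta_prev ) for factorized ±1 weights, q(w_i) ∝ exp(theta_i w_i)
    m = np.tanh(theta)
    return np.sum((theta - theta_prev) * m
                  - np.log(np.cosh(theta)) + np.log(np.cosh(theta_prev)))

def loss_and_grad(theta, X, y, theta_prev, lam):
    # Minimal sketch (not the paper's exact algorithm): negative log-likelihood of a
    # binary perceptron under the Gaussian preactivation approximation, plus a KL
    # anchor to the previous task's fields, which act as the prior.
    N = theta.size
    m = np.tanh(theta)                      # mean weights
    v = 1.0 - m**2                          # weight variances
    mu = X @ m / np.sqrt(N)                 # preactivation means, shape (P,)
    s2 = (X**2) @ v / N + 1e-12             # preactivation variances, shape (P,)
    z = y * mu / np.sqrt(s2)
    p = norm.cdf(z) + 1e-12                 # P(correct label) per example
    nll = -np.log(p).sum()
    # d nll / d m  (gradient path through the variance s2 is dropped for brevity)
    dm = -((norm.pdf(z) / p)[:, None] * (y[:, None] * X) / np.sqrt(N * s2)[:, None]).sum(axis=0)
    grad = dm * v                           # chain rule: d m / d theta = 1 - m^2
    grad += lam * (theta - theta_prev) * v  # d KL / d theta = (theta - theta_prev)(1 - m^2)
    return nll + lam * kl_binary_fields(theta, theta_prev), grad

# Toy usage: train on a second task while anchoring to task-1 fields theta1.
rng = np.random.default_rng(0)
N, P = 200, 400
theta1 = rng.normal(size=N)                 # fields learned on a previous task (assumed given)
X = rng.choice([-1.0, 1.0], size=(P, N))
y = np.sign(X @ np.sign(theta1) + 0.5 * rng.normal(size=P))  # noisy labels near task 1
theta = theta1.copy()
for _ in range(200):                        # plain gradient descent in field space
    loss, g = loss_and_grad(theta, X, y, theta1, lam=1.0)
    theta -= 0.05 * g

One natural readout after training is the deterministic network w_i = sign(θ_i), with |θ_i| encoding the weight uncertainty that controls how strongly each synapse resists being rewritten by later tasks.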