

Statistical mechanics of continual learning: Variational principle and mean-field potential.

Authors

Li Chan, Huang Zhenye, Zou Wenxuan, Huang Haiping

Affiliations

PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China.

CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China.

Publication

Phys Rev E. 2023 Jul;108(1-1):014309.
DOI: 10.1103/PhysRevE.108.014309
PMID: 37583230
Abstract

Continual learning of multiple tasks of different natures poses an obstacle to artificial general intelligence. Recently, various heuristic tricks have been proposed, from both machine-learning and neuroscience angles, but they lack a unified theoretical foundation. Here, we focus on continual learning in single-layer and multilayer neural networks with binary weights. We propose a variational Bayesian learning setting in which the networks are trained in a field space rather than in the discrete weight space, where gradients are ill-defined; weight uncertainty is thereby naturally incorporated, and it modulates synaptic resources among tasks. From a physics perspective, we translate variational continual learning into a Franz-Parisi thermodynamic-potential framework, in which knowledge of previous tasks serves as both a prior probability and a reference configuration. We thus interpret continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be studied analytically with mean-field order parameters, whose predictions agree with numerical experiments using stochastic gradient descent. Based on the variational principle and a Gaussian-field approximation of the internal preactivations in hidden layers, we also derive a learning algorithm that accounts for weight uncertainty; it solves continual learning with binary weights in multilayer networks and outperforms the currently available metaplasticity algorithm, in which binary synapses carry hidden continuous states and synaptic plasticity is modulated by a heuristic regularization function. Our principled frameworks also connect to elastic weight consolidation, weight-uncertainty-modulated learning, and neuroscience-inspired metaplasticity, providing a theoretically grounded method for real-world multitask learning with deep networks.


Similar Articles

1. Statistical mechanics of continual learning: Variational principle and mean-field potential. Phys Rev E. 2023 Jul;108(1-1):014309. doi: 10.1103/PhysRevE.108.014309.
2. Variational Data-Free Knowledge Distillation for Continual Learning. IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12618-12634. doi: 10.1109/TPAMI.2023.3271626. Epub 2023 Sep 5.
3. Variational mean-field theory for training restricted Boltzmann machines with binary synapses. Phys Rev E. 2020 Sep;102(3-1):030301. doi: 10.1103/PhysRevE.102.030301.
4. Bayesian continual learning via spiking neural networks. Front Comput Neurosci. 2022 Nov 16;16:1037976. doi: 10.3389/fncom.2022.1037976. eCollection 2022.
5. Return of the normal distribution: Flexible deep continual learning with variational auto-encoders. Neural Netw. 2022 Oct;154:397-412. doi: 10.1016/j.neunet.2022.07.016. Epub 2022 Jul 21.
6. Loss of plasticity in deep continual learning. Nature. 2024 Aug;632(8026):768-774. doi: 10.1038/s41586-024-07711-7. Epub 2024 Aug 21.
7. Continual Learning Using Bayesian Neural Networks. IEEE Trans Neural Netw Learn Syst. 2021 Sep;32(9):4243-4252. doi: 10.1109/TNNLS.2020.3017292. Epub 2021 Aug 31.
8. Continual learning with attentive recurrent neural networks for temporal data classification. Neural Netw. 2023 Jan;158:171-187. doi: 10.1016/j.neunet.2022.10.031. Epub 2022 Nov 11.
9. Task-Agnostic Continual Learning Using Online Variational Bayes With Fixed-Point Updates. Neural Comput. 2021 Oct 12;33(11):3139-3177. doi: 10.1162/neco_a_01430.
10. Bio-inspired, task-free continual learning through activity regularization. Biol Cybern. 2023 Oct;117(4-5):345-361. doi: 10.1007/s00422-023-00973-w. Epub 2023 Aug 17.

Cited By

1. Eight challenges in developing theory of intelligence. Front Comput Neurosci. 2024 Jul 24;18:1388166. doi: 10.3389/fncom.2024.1388166. eCollection 2024.