
Effective methods and framework for energy-based local learning of deep neural networks.

Author Information

Chen Haibo, Yang Bangcheng, He Fucun, Zhou Fei, Chen Shuai, Wu Chunpeng, Li Fan, Chua Yansong

Affiliations

China Nanhu Academy of Electronics and Information Technology, Jiaxing, China.

China Electric Power Research Institute, Beijing, China.

Publication Information

Front Artif Intell. 2025 Aug 26;8:1605706. doi: 10.3389/frai.2025.1605706. eCollection 2025.

Abstract

From a neuroscience perspective, artificial neural networks are regarded as abstract models of biological neurons, yet they rely on biologically implausible backpropagation for training. Energy-based models represent a class of brain-inspired learning frameworks that adjust system states by minimizing an energy function. Predictive coding (PC), a theoretical model within the energy-based family, constructs its energy function from forward prediction errors and is optimized by minimizing local, layer-wise errors. Owing to its local plasticity, PC emerges as the most promising alternative to backpropagation. However, PC faces gradient-explosion and gradient-vanishing challenges in deep networks with many layers: gradients explode when layer-wise prediction errors are excessively large, and vanish when they are excessively small. To address these challenges, we propose a bidirectional energy that stabilizes prediction errors and mitigates gradient explosion, and use skip connections to resolve the gradient-vanishing problem. We also introduce a layer-adaptive learning rate (LALR) to improve training efficiency. Our model achieves accuracies of 99.22% on MNIST, 93.78% on CIFAR-10, 83.96% on CIFAR-100, and 73.35% on Tiny ImageNet, comparable to identically structured networks trained with backpropagation. Finally, we developed a JAX-based framework for efficient training of energy-based models, reducing training time by half compared to PyTorch.
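To make the PC mechanism concrete, the sketch below shows a predictive-coding energy in JAX with an added backward error term, one plausible reading of the "bidirectional energy" idea. It is illustrative only: the weight names `W_f` and `W_b`, the `tanh` nonlinearity, the weighting `lam`, and the layer sizes are assumptions, not the paper's actual implementation.

```python
# Illustrative sketch (assumed formulation, not the paper's code):
# E = sum_l ||x_l - f(W_f x_{l-1})||^2 + lam * ||x_{l-1} - f(W_b x_l)||^2
# The backward term penalizes reconstruction of the previous layer,
# bounding layer-wise prediction errors in both directions.
import jax
import jax.numpy as jnp

def bidirectional_energy(params, activities, lam=0.5):
    E = 0.0
    for (W_f, W_b), x_prev, x_next in zip(params, activities[:-1], activities[1:]):
        e_fwd = x_next - jnp.tanh(x_prev @ W_f)  # forward prediction error
        e_bwd = x_prev - jnp.tanh(x_next @ W_b)  # backward reconstruction error
        E = E + jnp.sum(e_fwd ** 2) + lam * jnp.sum(e_bwd ** 2)
    return E

# Toy setup with hypothetical layer sizes.
sizes = [784, 256, 10]
key = jax.random.PRNGKey(0)
keys = jax.random.split(key, 2 * (len(sizes) - 1) + len(sizes))
params = [(0.1 * jax.random.normal(keys[2 * i], (sizes[i], sizes[i + 1])),
           0.1 * jax.random.normal(keys[2 * i + 1], (sizes[i + 1], sizes[i])))
          for i in range(len(sizes) - 1)]
activities = [jax.random.normal(k, (s,)) for k, s in zip(keys[-len(sizes):], sizes)]

# Inference relaxes activities by gradient descent on E; each layer's update
# depends only on its neighbours, so the rule is local. In practice the input
# and target layers would be clamped during relaxation.
grad_x = jax.grad(bidirectional_energy, argnums=1)(params, activities)
activities = [x - 0.1 * g for x, g in zip(activities, grad_x)]
```

Because the energy decomposes into per-layer terms, weight updates likewise depend only on adjacent layers, which is what makes PC a local alternative to backpropagation.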

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/60be/12418518/11c6fc5227a2/frai-08-1605706-g0001.jpg
