Low Complexity Gradient Computation Techniques to Accelerate Deep Neural Network Training.

Authors

Shin Dongyeob, Kim Geonho, Jo Joongho, Park Jongsun

Publication Information

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5745-5759. doi: 10.1109/TNNLS.2021.3130991. Epub 2023 Sep 1.

Abstract

Deep neural network (DNN) training is an iterative process of updating network weights, called gradient computation, in which the (mini-batch) stochastic gradient descent (SGD) algorithm is generally used. Since SGD inherently allows gradient computations with noise, properly approximating the weight-gradient computation within the SGD noise can be a promising technique to save energy and time during DNN training. This article proposes two novel techniques that reduce the computational complexity of the gradient computations to accelerate SGD-based DNN training. First, considering that the output predictions of a network (confidence) change with the training inputs, the relation between the confidence and the magnitude of the weight gradient can be exploited to skip gradient computations without seriously sacrificing accuracy, especially for high-confidence inputs. Second, angle-diversity-based approximations of the intermediate activations used for weight-gradient calculation are also presented. Based on the fact that the angle diversity of the gradients is small (highly uncorrelated) in the early training epochs, the bit precision of the activations can be reduced to 2/4/8 bit depending on the resulting angle error between the original gradient and the quantized gradient. Simulations show that the proposed approach can skip up to 75.83% of the gradient computations with negligible accuracy degradation on the CIFAR-10 dataset using ResNet-20. Hardware implementation results in a 65-nm CMOS technology also show that the proposed training accelerator achieves up to 1.69× higher energy efficiency than other training accelerators.
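The first technique skips weight-gradient computation for inputs that the network already classifies with high confidence. Below is a minimal PyTorch-style sketch of this idea for a standard softmax classifier; the confidence measure (maximum softmax probability), the fixed threshold `conf_threshold`, and the per-sample masking are illustrative assumptions, not the exact skipping criterion used in the paper.

```python
# Minimal sketch of confidence-based gradient skipping (illustrative only;
# the threshold and masking scheme below are assumptions, not the paper's
# exact skipping criterion).
import torch
import torch.nn.functional as F

def training_step_with_skipping(model, optimizer, x, y, conf_threshold=0.95):
    """Run one SGD step, computing weight gradients only for the
    low-confidence samples of the mini-batch."""
    logits = model(x)
    # Per-sample confidence: maximum softmax probability of the prediction.
    confidence = F.softmax(logits, dim=1).max(dim=1).values
    # High-confidence samples tend to have small weight gradients,
    # so their backward computation is skipped.
    keep = confidence < conf_threshold
    skipped_fraction = 1.0 - keep.float().mean().item()
    if keep.any():
        loss = F.cross_entropy(logits[keep], y[keep])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return skipped_fraction
```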

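The second technique lowers the bit precision of the intermediate activations used in the weight-gradient calculation, choosing the bit-width from the angle error between the full-precision gradient and the gradient obtained with quantized activations. The sketch below illustrates such a selection for a single linear layer; the uniform quantizer, the angle-error threshold `max_angle`, and the fallback precision are assumptions for illustration rather than the paper's exact procedure.

```python
# Sketch of angle-error-based bit-width selection for quantized activations
# (illustrative; the uniform quantizer, threshold, and fallback precision
# are assumptions, not the paper's exact procedure).
import torch
import torch.nn.functional as F

def quantize(t, bits):
    """Uniform symmetric quantization of a tensor to the given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max().clamp_min(1e-12) / qmax
    return torch.round(t / scale).clamp(-qmax, qmax) * scale

def angle_error(g_ref, g_approx):
    """Angle (radians) between the reference and approximate gradients."""
    cos = F.cosine_similarity(g_ref.flatten(), g_approx.flatten(), dim=0)
    return torch.acos(cos.clamp(-1.0, 1.0))

def select_activation_bits(activation, grad_out, max_angle=0.1):
    """Pick the smallest bit-width (2/4/8) whose quantized-activation
    weight gradient stays within the allowed angle error.

    activation: (batch, in_features) input of a linear layer
    grad_out:   (batch, out_features) gradient w.r.t. the layer output
    """
    # For a linear layer, dL/dW = grad_out^T @ activation.
    g_ref = grad_out.t() @ activation
    for bits in (2, 4, 8):
        g_q = grad_out.t() @ quantize(activation, bits)
        if angle_error(g_ref, g_q) <= max_angle:
            return bits
    return 16  # fall back to higher precision if no low bit-width qualifies
```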

Similar Articles

1. Low Complexity Gradient Computation Techniques to Accelerate Deep Neural Network Training. IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5745-5759. doi: 10.1109/TNNLS.2021.3130991. Epub 2023 Sep 1.
2. Exploiting Retraining-Based Mixed-Precision Quantization for Low-Cost DNN Accelerator Design. IEEE Trans Neural Netw Learn Syst. 2021 Jul;32(7):2925-2938. doi: 10.1109/TNNLS.2020.3008996. Epub 2021 Jul 6.
3. Accelerating DNN Training Through Selective Localized Learning. Front Neurosci. 2022 Jan 11;15:759807. doi: 10.3389/fnins.2021.759807. eCollection 2021.
4. Enabling Training of Neural Networks on Noisy Hardware. Front Artif Intell. 2021 Sep 9;4:699148. doi: 10.3389/frai.2021.699148. eCollection 2021.
5. Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators. IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3632-3647. doi: 10.1109/TPAMI.2022.3181972.
6. ETA: An Efficient Training Accelerator for DNNs Based on Hardware-Algorithm Co-Optimization. IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7660-7674. doi: 10.1109/TNNLS.2022.3145850. Epub 2023 Oct 5.
7. SmartDeal: Remodeling Deep Network Weights for Efficient Inference and Training. IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7099-7113. doi: 10.1109/TNNLS.2021.3138056. Epub 2023 Oct 5.
8. Accelerating deep neural network training with inconsistent stochastic gradient descent. Neural Netw. 2017 Sep;93:219-229. doi: 10.1016/j.neunet.2017.06.003. Epub 2017 Jun 16.
9. PID Controller-Based Stochastic Optimization Acceleration for Deep Neural Networks. IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5079-5091. doi: 10.1109/TNNLS.2019.2963066. Epub 2020 Nov 30.
10. Early Termination Based Training Acceleration for an Energy-Efficient SNN Processor Design. IEEE Trans Biomed Circuits Syst. 2022 Jun;16(3):442-455. doi: 10.1109/TBCAS.2022.3181808. Epub 2022 Jul 12.

Cited By

1. Enhancing deep neural network training efficiency and performance through linear prediction. Sci Rep. 2024 Jul 2;14(1):15197. doi: 10.1038/s41598-024-65691-0.
