
Learned Gradient Compression for Distributed Deep Learning.

Author Information

Abrahamyan Lusine, Chen Yiming, Bekoulis Giannis, Deligiannis Nikos

Publication Information

IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7330-7344. doi: 10.1109/TNNLS.2021.3084806. Epub 2022 Nov 30.

Abstract

Training deep neural networks on large datasets containing high-dimensional data requires a large amount of computation. A solution to this problem is data-parallel distributed training, in which the model is replicated across several computational nodes, each with access to a different chunk of the data. This approach, however, incurs high communication cost and latency, because the computed gradients must be shared among the nodes at every iteration. The problem becomes more pronounced when the nodes communicate wirelessly (e.g., owing to limited network bandwidth). To address this problem, various compression methods have been proposed, including sparsification, quantization, and entropy encoding of the gradients. Existing methods exploit intra-node information redundancy; that is, they compress the gradients at each node independently. In contrast, we argue that the gradients across the nodes are correlated and propose methods that leverage this inter-node redundancy to improve compression efficiency. Depending on the node communication protocol (parameter server or ring-allreduce), we propose two instances of gradient compression that we coin Learned Gradient Compression (LGC). Our methods exploit an autoencoder, trained during the first stages of the distributed training, to capture the common information present in the gradients of the distributed nodes. To constrain the nodes' computational complexity, the autoencoder is realized with a lightweight neural network. We have tested our LGC methods on image classification and semantic segmentation tasks using different convolutional neural networks (CNNs) [ResNet50, ResNet101, and the pyramid scene parsing network (PSPNet)] and multiple datasets (ImageNet, Cifar10, and CamVid).
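The intra-node sparsification baseline the abstract refers to can be sketched as top-k gradient selection: each node transmits only the largest-magnitude entries of its gradient (methods such as DGC additionally accumulate the dropped entries locally as residual error). This is an illustrative sketch, not the paper's LGC method; the tensor shape and the 1% keep ratio are arbitrary choices for the example.

```python
import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Returns (indices, values): the compressed representation a node would
    transmit instead of the dense gradient.
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # Indices of the k largest-magnitude entries (unordered, O(n) selection).
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def desparsify(idx, vals, shape):
    """Reconstruct a dense (mostly zero) gradient from (indices, values)."""
    flat = np.zeros(int(np.prod(shape)), dtype=vals.dtype)
    flat[idx] = vals
    return flat.reshape(shape)

rng = np.random.default_rng(0)
grad = rng.standard_normal((64, 64)).astype(np.float32)  # stand-in gradient tensor
idx, vals = topk_sparsify(grad, ratio=0.01)   # transmit ~1% of the entries
recon = desparsify(idx, vals, grad.shape)     # receiver-side reconstruction
```

Because each node selects its top-k entries independently, this baseline exploits only intra-node redundancy, which is precisely the limitation the paper's inter-node approach targets.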
The ResNet101 model trained for image classification on Cifar10 reached an accuracy of 93.57%, only 0.18% below the baseline distributed training with uncompressed gradients, while the communicated gradient data were reduced by 8095× compared with the baseline and by 8× compared with the state-of-the-art deep gradient compression (DGC) method.
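The autoencoder-based compression described above can be illustrated with a minimal sketch: a node encodes its (flattened) gradient into a low-dimensional code, transmits the code, and the receiver decodes it. Note the assumptions: the paper's LGC uses a lightweight CNN autoencoder trained on gradients collected during the first stages of distributed training to capture inter-node common information; the tied linear encoder/decoder, random weights, and dimensions below are hypothetical stand-ins chosen only to show the data flow and the nominal rate reduction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a node's flattened gradient chunk and the
# autoencoder's bottleneck dimension.
GRAD_DIM, CODE_DIM = 1024, 32

# Linear encoder/decoder with tied weights stands in for the paper's
# lightweight CNN autoencoder; in LGC these would be learned, not random.
W_enc = rng.standard_normal((CODE_DIM, GRAD_DIM)) / np.sqrt(GRAD_DIM)
W_dec = W_enc.T

def compress(grad):
    # The node transmits only this CODE_DIM-dimensional code.
    return W_enc @ grad

def decompress(code):
    # The receiver (parameter server or next ring neighbor) reconstructs.
    return W_dec @ code

grad = rng.standard_normal(GRAD_DIM)      # stand-in gradient vector
code = compress(grad)
recon = decompress(code)
ratio = GRAD_DIM / CODE_DIM               # nominal compression factor
```

In the paper's setting the savings compound further because the learned code captures information common to all nodes' gradients, so per-node residuals stay small.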

