Lee Ching-Hua, Fedorov Igor, Rao Bhaskar D, Garudadri Harinath
Department of ECE, University of California, San Diego.
ARM ML Research.
Proc IEEE Int Conf Acoust Speech Signal Process. 2020 May;2020:5410-5414. doi: 10.1109/icassp40776.2020.9054436. Epub 2020 May 14.
While deep neural networks (DNNs) have achieved state-of-the-art results in many fields, they are typically over-parameterized. Parameter redundancy, in turn, leads to inefficiency. Sparse signal recovery (SSR) techniques, on the other hand, find compact solutions to overcomplete linear problems. Therefore, a logical step is to draw the connection between SSR and DNNs. In this paper, we explore the application of iterative reweighting methods popular in SSR to learning efficient DNNs. By efficient, we mean sparse networks that require less computation and storage than the original, dense network. We propose a reweighting framework to learn sparse connections within a given architecture without biasing the optimization process, by utilizing the affine scaling transformation strategy. The resulting algorithm, referred to as Sparsity-promoting Stochastic Gradient Descent (SSGD), has simple gradient-based updates which can be easily implemented in existing deep learning libraries. We demonstrate the sparsification ability of SSGD on image classification tasks and show that it outperforms existing methods on the MNIST and CIFAR-10 datasets.
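The paper does not include code here, so the snippet below is only a rough sketch of how an iteratively reweighted sparsity penalty can be folded into ordinary gradient-based training, in the spirit the abstract describes. It uses a classic reweighted-L1 penalty inside a PyTorch SGD loop; it is not the authors' affine-scaling-based SSGD update, and the MLP architecture, `lambda_`, and `eps` values are hypothetical choices for illustration.

```python
# Sketch only: classic iteratively-reweighted L1 regularization added to a
# standard SGD loop, illustrating reweighting-based sparsification during
# training. NOT the paper's exact SSGD algorithm (which uses an affine
# scaling transformation); model, lambda_, and eps are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
lambda_, eps = 1e-4, 1e-3  # hypothetical penalty strength and reweighting smoothing term

def training_step(x, y):
    opt.zero_grad()
    loss = F.cross_entropy(model(x), y)
    for name, p in model.named_parameters():
        if p.dim() < 2:     # skip biases; only weight matrices are sparsified
            continue
        w = p.detach()      # current weight values, treated as constants in the penalty
        # Reweighted L1: each weight is penalized inversely to its current
        # magnitude, so already-small weights are pushed harder toward zero.
        loss = loss + lambda_ * (p.abs() / (w.abs() + eps)).sum()
    loss.backward()
    opt.step()
    return loss.item()

# Dummy batch standing in for MNIST-sized inputs.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
print(training_step(x, y))

# After training, near-zero weights can be pruned with a hard threshold,
# e.g. mask = (p.abs() > 1e-3), yielding a sparse connectivity pattern.
```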