Consistent Sparse Deep Learning: Theory and Computation.

Authors

Sun Yan, Song Qifan, Liang Faming

Affiliations

Department of Statistics, Purdue University, West Lafayette, IN 47907.

Publication Information

J Am Stat Assoc. 2022;117(540):1981-1995. doi: 10.1080/01621459.2021.1895175. Epub 2021 Apr 20.

Abstract

Deep learning has been the engine powering many successes of data science. However, the deep neural network (DNN), as the basic model of deep learning, is often excessively over-parameterized, causing many difficulties in training, prediction and interpretation. We propose a frequentist-like method for learning sparse DNNs and justify its consistency under the Bayesian framework: the proposed method could learn a sparse DNN with at most O(n/log(n)) connections, where n is the sample size, and nice theoretical guarantees such as posterior consistency, variable selection consistency and asymptotically optimal generalization bounds. In particular, we establish posterior consistency for the sparse DNN with a mixture Gaussian prior, show that the structure of the sparse DNN can be consistently determined using a Laplace approximation-based marginal posterior inclusion probability approach, and use Bayesian evidence to elicit sparse DNNs learned by an optimization method such as stochastic gradient descent in multiple runs with different initializations. The proposed method is computationally more efficient than standard Bayesian methods for large-scale sparse DNNs. The numerical results indicate that the proposed method can perform very well for large-scale network compression and high-dimensional nonlinear variable selection, both advancing interpretable machine learning.
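
To make the structure-selection step concrete, below is a minimal sketch (not the authors' code) of a Laplace-approximation-style marginal posterior inclusion probability under a two-component mixture Gaussian (spike-and-slab) prior, w ~ λ·N(0, σ₁²) + (1−λ)·N(0, σ₀²) with σ₀ ≪ σ₁. The hyperparameter values, the inclusion_prob helper, the synthetic weights, and the 0.5 cutoff are all illustrative assumptions; the paper derives theory-driven choices for these quantities.

```python
import numpy as np
from scipy.stats import norm

def inclusion_prob(w, lam=1e-4, sigma0=1e-4, sigma1=0.1):
    """Approximate marginal posterior inclusion probability of each
    connection, evaluated at trained weights w, under the mixture
    Gaussian prior  w ~ lam*N(0, sigma1^2) + (1-lam)*N(0, sigma0^2).
    (Hypothetical helper; hyperparameter values are illustrative.)"""
    slab = lam * norm.pdf(w, loc=0.0, scale=sigma1)           # "connection present"
    spike = (1.0 - lam) * norm.pdf(w, loc=0.0, scale=sigma0)  # "connection absent"
    return slab / (slab + spike)

# Stand-in for weights from one SGD run: mostly near-zero, a few informative.
rng = np.random.default_rng(0)
weights = np.concatenate([1e-5 * rng.standard_normal(900),
                          0.3 * rng.standard_normal(100)])

keep = inclusion_prob(weights) > 0.5  # retain connections with high inclusion probability
print(f"retained {keep.sum()} of {keep.size} connections")
```

In the multi-run setting described in the abstract, each run yields such a sparse structure, and Bayesian evidence is then used to select among the sparse DNNs obtained from the different initializations.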

Similar Articles

3
SSGD: Sparsity-Promoting Stochastic Gradient Descent Algorithm for Unbiased DNN Pruning.
Proc IEEE Int Conf Acoust Speech Signal Process. 2020 May;2020:5410-5414. doi: 10.1109/icassp40776.2020.9054436. Epub 2020 May 14.
7
An Efficient Sparse Bayesian Learning Algorithm Based on Gaussian-Scale Mixtures.
IEEE Trans Neural Netw Learn Syst. 2022 Jul;33(7):3065-3078. doi: 10.1109/TNNLS.2020.3049056. Epub 2022 Jul 6.
9
Knowledge Transfer-Based Sparse Deep Belief Network.
IEEE Trans Cybern. 2023 Dec;53(12):7572-7583. doi: 10.1109/TCYB.2022.3173632. Epub 2023 Nov 29.

Cited By

3
A survey of model compression techniques: past, present, and future.
Front Robot AI. 2025 Mar 20;12:1518965. doi: 10.3389/frobt.2025.1518965. eCollection 2025.
4
Extended fiducial inference: toward an automated process of statistical inference.
J R Stat Soc Series B Stat Methodol. 2024 Aug 5;87(1):98-131. doi: 10.1093/jrsssb/qkae082. eCollection 2025 Feb.
6
Deep network embedding with dimension selection.
Neural Netw. 2024 Nov;179:106512. doi: 10.1016/j.neunet.2024.106512. Epub 2024 Jul 11.

References

3
Bayesian Neural Networks for Selection of Drug Sensitive Genes.
J Am Stat Assoc. 2018;113(523):955-972. doi: 10.1080/01621459.2017.1409122. Epub 2018 Jun 28.
7
Error bounds for approximations with deep ReLU networks.
Neural Netw. 2017 Oct;94:103-114. doi: 10.1016/j.neunet.2017.07.002. Epub 2017 Jul 13.
