
Consistent Sparse Deep Learning: Theory and Computation

Authors

Sun Yan, Song Qifan, Liang Faming

Affiliation

Department of Statistics, Purdue University, West Lafayette, IN 47907.

Publication

J Am Stat Assoc. 2022;117(540):1981-1995. doi: 10.1080/01621459.2021.1895175. Epub 2021 Apr 20.

DOI: 10.1080/01621459.2021.1895175
PMID: 36945326
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10027379/
Abstract

Deep learning has been the engine powering many successes of data science. However, the deep neural network (DNN), as the basic model of deep learning, is often excessively over-parameterized, causing many difficulties in training, prediction and interpretation. We propose a frequentist-like method for learning sparse DNNs and justify its consistency under the Bayesian framework: the proposed method could learn a sparse DNN with at most O(n/log(n)) connections and nice theoretical guarantees such as posterior consistency, variable selection consistency and asymptotically optimal generalization bounds. In particular, we establish posterior consistency for the sparse DNN with a mixture Gaussian prior, show that the structure of the sparse DNN can be consistently determined using a Laplace approximation-based marginal posterior inclusion probability approach, and use Bayesian evidence to elicit sparse DNNs learned by an optimization method such as stochastic gradient descent in multiple runs with different initializations. The proposed method is computationally more efficient than standard Bayesian methods for large-scale sparse DNNs. The numerical results indicate that the proposed method can perform very well for large-scale network compression and high-dimensional nonlinear variable selection, both advancing interpretable machine learning.
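To make the recipe in the abstract concrete, below is a minimal PyTorch sketch, written for illustration rather than as the authors' implementation. It assumes the common two-normal ("spike-and-slab") form of a mixture Gaussian prior, pi(w) = lam*N(0, sigma1^2) + (1-lam)*N(0, sigma0^2) with sigma0 << sigma1, runs SGD on the resulting penalized loss to obtain a MAP-style estimate, and then prunes each connection whose plug-in posterior inclusion probability falls below 1/2. The paper's Laplace-approximation refinement of the inclusion probabilities and its Bayesian-evidence selection across multiple initializations are omitted; all hyperparameters (lam, sigma0, sigma1) and the toy data are assumptions.

import math
import torch
import torch.nn as nn

def neg_log_mixture_prior(w, lam=1e-4, sigma0=0.05, sigma1=1.0):
    # -log pi(w), summed over all entries of w, for the assumed prior
    # pi(w) = lam * N(0, sigma1^2) + (1 - lam) * N(0, sigma0^2);
    # the shared 1/sqrt(2*pi) constant is dropped (it does not affect SGD).
    log_slab = math.log(lam) - 0.5 * (w / sigma1) ** 2 - math.log(sigma1)
    log_spike = math.log(1.0 - lam) - 0.5 * (w / sigma0) ** 2 - math.log(sigma0)
    return -torch.logsumexp(torch.stack([log_slab, log_spike]), dim=0).sum()

def inclusion_prob(w, lam=1e-4, sigma0=0.05, sigma1=1.0):
    # Plug-in estimate of P(connection belongs to the slab | w); the paper
    # instead uses a Laplace approximation of the marginal posterior.
    slab = lam * torch.exp(-0.5 * (w / sigma1) ** 2) / sigma1
    spike = (1.0 - lam) * torch.exp(-0.5 * (w / sigma0) ** 2) / sigma0
    return slab / (slab + spike)

torch.manual_seed(0)
n, p = 200, 20
X = torch.randn(n, p)
y = torch.sin(X[:, :1]) + 0.1 * torch.randn(n, 1)  # only the first input is relevant

model = nn.Sequential(nn.Linear(p, 16), nn.Tanh(), nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=5e-4)

for _ in range(3000):  # MAP estimation: minimize NLL + negative log prior
    opt.zero_grad()
    nll = 0.5 * ((model(X) - y) ** 2).sum()  # Gaussian likelihood, unit noise variance
    prior = sum(neg_log_mixture_prior(w) for w in model.parameters())
    (nll + prior).backward()
    opt.step()

# Structure selection: keep a connection only if its estimated inclusion
# probability exceeds 1/2 (biases are treated like weights for simplicity).
with torch.no_grad():
    for w in model.parameters():
        w.mul_((inclusion_prob(w) > 0.5).float())

kept = sum((w != 0).sum().item() for w in model.parameters())
total = sum(w.numel() for w in model.parameters())
print(f"kept {kept}/{total} connections")

With these assumed hyperparameters the threshold works out to roughly |w| > 0.25, so weights pulled into the spike component are zeroed out; in the paper this selection step comes with consistency guarantees, and the best run among several initializations would be chosen by Bayesian evidence.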


Similar Articles

1. Consistent Sparse Deep Learning: Theory and Computation.
J Am Stat Assoc. 2022;117(540):1981-1995. doi: 10.1080/01621459.2021.1895175. Epub 2021 Apr 20.
2. Learning Sparse Deep Neural Networks with a Spike-and-Slab Prior.
Stat Probab Lett. 2022 Jan;180. doi: 10.1016/j.spl.2021.109246. Epub 2021 Sep 24.
3. SSGD: Sparsity-Promoting Stochastic Gradient Descent Algorithm for Unbiased DNN Pruning.
Proc IEEE Int Conf Acoust Speech Signal Process. 2020 May;2020:5410-5414. doi: 10.1109/icassp40776.2020.9054436. Epub 2020 May 14.
4. Nonconvex Sparse Regularization for Deep Neural Networks and Its Optimality.
Neural Comput. 2022 Jan 14;34(2):476-517. doi: 10.1162/neco_a_01457.
5. Layer adaptive node selection in Bayesian neural networks: Statistical guarantees and implementation details.
Neural Netw. 2023 Oct;167:309-330. doi: 10.1016/j.neunet.2023.08.029. Epub 2023 Aug 22.
6. Prediction of Compound Profiling Matrices, Part II: Relative Performance of Multitask Deep Learning and Random Forest Classification on the Basis of Varying Amounts of Training Data.
ACS Omega. 2018 Sep 30;3(9):12033-12040. doi: 10.1021/acsomega.8b01682. Epub 2018 Sep 27.
7. An Efficient Sparse Bayesian Learning Algorithm Based on Gaussian-Scale Mixtures.
IEEE Trans Neural Netw Learn Syst. 2022 Jul;33(7):3065-3078. doi: 10.1109/TNNLS.2020.3049056. Epub 2022 Jul 6.
8. Transformed ℓ1 regularization for learning sparse deep neural networks.
Neural Netw. 2019 Nov;119:286-298. doi: 10.1016/j.neunet.2019.08.015. Epub 2019 Aug 27.
9. Knowledge Transfer-Based Sparse Deep Belief Network.
IEEE Trans Cybern. 2023 Dec;53(12):7572-7583. doi: 10.1109/TCYB.2022.3173632. Epub 2023 Nov 29.
10. Deep neural network-based prediction of tsunami wave attenuation by mangrove forests.
MethodsX. 2024 Jun 11;13:102791. doi: 10.1016/j.mex.2024.102791. eCollection 2024 Dec.

Cited By

1. Extended fiducial inference for individual treatment effects via deep neural networks.
Stat Comput. 2025;35(4):97. doi: 10.1007/s11222-025-10624-8. Epub 2025 May 17.
2. A New Paradigm for Generative Adversarial Networks based on Randomized Decision Rules.
Stat Sin. 2025 Apr;35(2):897-918. doi: 10.5705/ss.202022.0404.
3. A survey of model compression techniques: past, present, and future.
Front Robot AI. 2025 Mar 20;12:1518965. doi: 10.3389/frobt.2025.1518965. eCollection 2025.
4. Extended fiducial inference: toward an automated process of statistical inference.
J R Stat Soc Series B Stat Methodol. 2024 Aug 5;87(1):98-131. doi: 10.1093/jrsssb/qkae082. eCollection 2025 Feb.
5. A Double Regression Method for Graphical Modeling of High-dimensional Nonlinear and Non-Gaussian Data.
Stat Interface. 2024;17(4):669-680. doi: 10.4310/22-sii756.
6. Deep network embedding with dimension selection.
Neural Netw. 2024 Nov;179:106512. doi: 10.1016/j.neunet.2024.106512. Epub 2024 Jul 11.
7. A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks.
Stat Comput. 2022;32(6):99. doi: 10.1007/s11222-022-10169-0. Epub 2022 Oct 22.
8. Learning Sparse Deep Neural Networks with a Spike-and-Slab Prior.
Stat Probab Lett. 2022 Jan;180. doi: 10.1016/j.spl.2021.109246. Epub 2021 Sep 24.

References

1. Extended Stochastic Gradient MCMC for Large-Scale Bayesian Variable Selection.
Biometrika. 2020 Dec;107(4):997-1004. doi: 10.1093/biomet/asaa029. Epub 2020 Jul 13.
2. Transformed ℓ1 regularization for learning sparse deep neural networks.
Neural Netw. 2019 Nov;119:286-298. doi: 10.1016/j.neunet.2019.08.015. Epub 2019 Aug 27.
3. Bayesian Neural Networks for Selection of Drug Sensitive Genes.
J Am Stat Assoc. 2018;113(523):955-972. doi: 10.1080/01621459.2017.1409122. Epub 2018 Jun 28.
4. A comparison of deep networks with ReLU activation function and linear spline-type methods.
Neural Netw. 2019 Feb;110:232-242. doi: 10.1016/j.neunet.2018.11.005. Epub 2018 Dec 4.
5. Optimal approximation of piecewise smooth functions using deep ReLU neural networks.
Neural Netw. 2018 Dec;108:296-330. doi: 10.1016/j.neunet.2018.08.019. Epub 2018 Sep 7.
6. Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science.
Nat Commun. 2018 Jun 19;9(1):2383. doi: 10.1038/s41467-018-04316-3.
7. Error bounds for approximations with deep ReLU networks.
Neural Netw. 2017 Oct;94:103-114. doi: 10.1016/j.neunet.2017.07.002. Epub 2017 Jul 13.