

Properties of the Geometry of Solutions and Capacity of Multilayer Neural Networks with Rectified Linear Unit Activations.

Affiliation

Artificial Intelligence Lab, Institute for Data Science and Analytics, Bocconi University, Milano 20135, Italy.

Publication Information

Phys Rev Lett. 2019 Oct 25;123(17):170602. doi: 10.1103/PhysRevLett.123.170602.

DOI: 10.1103/PhysRevLett.123.170602
PMID: 31702271
Abstract

Rectified linear units (ReLUs) have become the main model for the neural units in current deep learning systems. This choice was originally suggested as a way to compensate for the so-called vanishing gradient problem which can undercut stochastic gradient descent learning in networks composed of multiple layers. Here we provide analytical results on the effects of ReLUs on the capacity and on the geometrical landscape of the solution space in two-layer neural networks with either binary or real-valued weights. We study the problem of storing an extensive number of random patterns and find that, quite unexpectedly, the capacity of the network remains finite as the number of neurons in the hidden layer increases, at odds with the case of threshold units in which the capacity diverges. Possibly more important, a large deviation approach allows us to find that the geometrical landscape of the solution space has a peculiar structure: While the majority of solutions are close in distance but still isolated, there exist rare regions of solutions which are much more dense than the similar ones in the case of threshold units. These solutions are robust to perturbations of the weights and can tolerate large perturbations of the inputs. The analytical results are corroborated by numerical findings.
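The storage problem the abstract studies can be made concrete with a small numerical sketch: a two-layer network with ReLU hidden units and a fixed second layer is asked to reproduce random ±1 labels on random ±1 input patterns. This is only an illustrative toy, not the paper's analytic (replica/large-deviation) calculation; the network sizes, the hinge-loss training scheme, and all variable names below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Storage problem: P random binary patterns with random binary labels,
# fed to a two-layer network with K ReLU hidden units on N inputs.
N, K, P = 40, 5, 60
X = rng.choice([-1.0, 1.0], size=(P, N))   # random input patterns
y = rng.choice([-1.0, 1.0], size=P)        # random target labels

W = rng.normal(scale=1.0 / np.sqrt(N), size=(K, N))  # first-layer weights (trained)
v = np.ones(K)                                       # fixed second layer (committee-like)

def forward(X, W, v):
    h = np.maximum(X @ W.T, 0.0)   # ReLU hidden activations
    return h @ v                    # pre-sign output; sign gives the label

# Plain gradient descent on a hinge loss, over patterns not yet stored
# with margin 1 (a simple stand-in for any pattern-storage algorithm).
lr = 0.05
for _ in range(2000):
    out = forward(X, W, v)
    viol = y * out < 1.0
    if not viol.any():
        break
    h = X[viol] @ W.T
    mask = (h > 0).astype(float)    # ReLU derivative
    grad = -(y[viol, None] * v[None, :] * mask).T @ X[viol] / viol.sum()
    W -= lr * grad

stored = np.mean(np.sign(forward(X, W, v)) == y)
print(f"fraction of patterns stored: {stored:.2f}")
```

Varying P/N while holding K fixed (and then increasing K) is the kind of experiment that probes the capacity; the paper's surprising result is that for ReLU units this capacity stays finite as K grows, unlike the threshold-unit case.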


Similar Articles

1. Properties of the Geometry of Solutions and Capacity of Multilayer Neural Networks with Rectified Linear Unit Activations.
   Phys Rev Lett. 2019 Oct 25;123(17):170602. doi: 10.1103/PhysRevLett.123.170602.
2. Role of Synaptic Stochasticity in Training Low-Precision Neural Networks.
   Phys Rev Lett. 2018 Jun 29;120(26):268103. doi: 10.1103/PhysRevLett.120.268103.
3. Subdominant Dense Clusters Allow for Simple Learning and High Computational Performance in Neural Networks with Discrete Synapses.
   Phys Rev Lett. 2015 Sep 18;115(12):128101. doi: 10.1103/PhysRevLett.115.128101.
4. Learning through atypical phase transitions in overparameterized neural networks.
   Phys Rev E. 2022 Jul;106(1-1):014116. doi: 10.1103/PhysRevE.106.014116.
5. An improvement of extreme learning machine for compact single-hidden-layer feedforward neural networks.
   Int J Neural Syst. 2008 Oct;18(5):433-41. doi: 10.1142/S0129065708001695.
6. Typical and atypical solutions in nonconvex neural networks with discrete and continuous weights.
   Phys Rev E. 2023 Aug;108(2-1):024310. doi: 10.1103/PhysRevE.108.024310.
7. Shaping the learning landscape in neural networks around wide flat minima.
   Proc Natl Acad Sci U S A. 2020 Jan 7;117(1):161-170. doi: 10.1073/pnas.1908636117. Epub 2019 Dec 23.
8. The No-Prop algorithm: a new learning algorithm for multilayer neural networks.
   Neural Netw. 2013 Jan;37:182-8. doi: 10.1016/j.neunet.2012.09.020. Epub 2012 Oct 15.
9. Multilayer neural networks for reduced-rank approximation.
   IEEE Trans Neural Netw. 1994;5(5):684-97. doi: 10.1109/72.317721.
10. Efficient adaptive learning for classification tasks with binary units.
    Neural Comput. 1998 May 15;10(4):1007-30. doi: 10.1162/089976698300017601.

Cited By

1. Nonlinear classification of neural manifolds with contextual information.
   Phys Rev E. 2025 Mar;111(3-2):035302. doi: 10.1103/PhysRevE.111.035302.
2. Establish and validate the reliability of predictive models in bone mineral density by deep learning as examination tool for women.
   Osteoporos Int. 2024 Jan;35(1):129-141. doi: 10.1007/s00198-023-06913-5. Epub 2023 Sep 20.
3. Native-resolution myocardial principal Eulerian strain mapping using convolutional neural networks and Tagged Magnetic Resonance Imaging.
   Comput Biol Med. 2022 Feb;141:105041. doi: 10.1016/j.compbiomed.2021.105041. Epub 2021 Nov 18.
4. Identification of Novel Antagonists Targeting Cannabinoid Receptor 2 Using a Multi-Step Virtual Screening Strategy.
   Molecules. 2021 Nov 4;26(21):6679. doi: 10.3390/molecules26216679.
5. Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning.
   Comput Struct Biotechnol J. 2021 Mar 9;19:1512-1530. doi: 10.1016/j.csbj.2021.03.005. eCollection 2021.
6. Shaping the learning landscape in neural networks around wide flat minima.
   Proc Natl Acad Sci U S A. 2020 Jan 7;117(1):161-170. doi: 10.1073/pnas.1908636117. Epub 2019 Dec 23.