An analysis of training and generalization errors in shallow and deep networks.

Affiliations

Institute of Mathematical Sciences, Claremont Graduate University, Claremont, CA 91711, United States of America.

Center for Brains, Minds, and Machines, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, 02139, United States of America.

Publication information

Neural Netw. 2020 Jan;121:229-241. doi: 10.1016/j.neunet.2019.08.028. Epub 2019 Sep 7.

DOI: 10.1016/j.neunet.2019.08.028
PMID: 31574413

Abstract

This paper is motivated by an open problem around deep networks, namely, the apparent absence of over-fitting despite large over-parametrization which allows perfect fitting of the training data. In this paper, we analyze this phenomenon in the case of regression problems when each unit evaluates a periodic activation function. We argue that the minimal expected value of the square loss is inappropriate to measure the generalization error in approximation of compositional functions in order to take full advantage of the compositional structure. Instead, we measure the generalization error in the sense of maximum loss, and sometimes, as a pointwise error. We give estimates on exactly how many parameters ensure both zero training error as well as a good generalization error. We prove that a solution of a regularization problem is guaranteed to yield a good training error as well as a good generalization error and estimate how much error to expect at which test data.
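
The distinction the abstract draws between expected square loss and maximum loss can be illustrated with a toy experiment. The sketch below is not the paper's construction: it assumes a shallow model whose units are cosines and sines (one choice of periodic activation), a hypothetical target function `exp(sin(x))`, equispaced training points, and a minimum-norm least-squares fit standing in for a regularized solution. All names (`features`, `fit_min_norm`, `target`) are illustrative. The model has more parameters than training points, so it interpolates the training data, and the two error measures are then computed on a dense test grid.

```python
import numpy as np


def features(x, max_freq):
    """Periodic feature map: [1, cos(kx), sin(kx)] for k = 1..max_freq."""
    k = np.arange(1, max_freq + 1)
    kx = np.outer(x, k)
    return np.hstack([np.ones((len(x), 1)), np.cos(kx), np.sin(kx)])


def fit_min_norm(x_train, y_train, max_freq):
    """Minimum-norm least-squares coefficients (assumed stand-in for a regularized fit)."""
    phi = features(x_train, max_freq)
    coef, *_ = np.linalg.lstsq(phi, y_train, rcond=None)
    return coef


def target(x):
    """Hypothetical smooth periodic target, chosen only for illustration."""
    return np.exp(np.sin(x))


# 15 training points, 41 parameters: the model is over-parametrized.
n_train, max_freq = 15, 20
x_train = 2 * np.pi * np.arange(n_train) / n_train
y_train = target(x_train)
coef = fit_min_norm(x_train, y_train, max_freq)

# Training error: largest absolute residual on the training points.
train_max_err = np.max(np.abs(features(x_train, max_freq) @ coef - y_train))

# Test error measured two ways on a dense grid.
x_test = np.linspace(0.0, 2 * np.pi, 1000, endpoint=False)
residual = features(x_test, max_freq) @ coef - target(x_test)
test_mse = np.mean(residual ** 2)          # average-case (square loss)
test_max_err = np.max(np.abs(residual))    # worst-case (maximum loss)

print(f"parameters: {coef.size}, training points: {n_train}")
print(f"max training error: {train_max_err:.2e}")
print(f"test mean squared error: {test_mse:.2e}")
print(f"test maximum error: {test_max_err:.2e}")
```

The maximum error is never smaller than the root of the mean squared error, so a bound in the maximum sense is the stronger statement; the residual array also gives the pointwise error at each test location.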

Similar articles

1. An analysis of training and generalization errors in shallow and deep networks.
Neural Netw. 2020 Jan;121:229-241. doi: 10.1016/j.neunet.2019.08.028. Epub 2019 Sep 7.
2. Theoretical issues in deep networks.
Proc Natl Acad Sci U S A. 2020 Dec 1;117(48):30039-30045. doi: 10.1073/pnas.1907369117. Epub 2020 Jun 9.
3. Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness.
Neural Netw. 2020 Oct;130:85-99. doi: 10.1016/j.neunet.2020.06.024. Epub 2020 Jul 3.
4. Dimension independent bounds for general shallow networks.
Neural Netw. 2020 Mar;123:142-152. doi: 10.1016/j.neunet.2019.11.006. Epub 2019 Nov 22.
5. Simultaneous approximation of a smooth function and its derivatives by deep neural networks with piecewise-polynomial activations.
Neural Netw. 2023 Apr;161:242-253. doi: 10.1016/j.neunet.2023.01.035. Epub 2023 Feb 2.
6. Upper bound of the expected training error of neural network regression for a Gaussian noise sequence.
Neural Netw. 2001 Dec;14(10):1419-29. doi: 10.1016/s0893-6080(01)00122-8.
7. Going Deeper, Generalizing Better: An Information-Theoretic View for Deep Learning.
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):16683-16695. doi: 10.1109/TNNLS.2023.3297113. Epub 2024 Oct 29.
8. Fast generalization error bound of deep learning without scale invariance of activation functions.
Neural Netw. 2020 Sep;129:344-358. doi: 10.1016/j.neunet.2020.05.033. Epub 2020 Jun 22.
9. On the problem in model selection of neural network regression in overrealizable scenario.
Neural Comput. 2002 Aug;14(8):1979-2002. doi: 10.1162/089976602760128090.
10. Theory of deep convolutional neural networks III: Approximating radial functions.
Neural Netw. 2021 Dec;144:778-790. doi: 10.1016/j.neunet.2021.09.027. Epub 2021 Oct 6.