
Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness.

Affiliations

LSEC, ICMSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.

Division of Applied Mathematics, Brown University, Providence, RI 02912, USA.

Publication Information

Neural Netw. 2020 Oct;130:85-99. doi: 10.1016/j.neunet.2020.06.024. Epub 2020 Jul 3.

DOI: 10.1016/j.neunet.2020.06.024
PMID: 32650153
Abstract

The accuracy of deep learning, i.e., deep neural networks, can be characterized by dividing the total error into three main types: approximation error, optimization error, and generalization error. Whereas there are some satisfactory answers to the problems of approximation and optimization, much less is known about the theory of generalization. Most existing theoretical works for generalization fail to explain the performance of neural networks in practice. To derive a meaningful bound, we study the generalization error of neural networks for classification problems in terms of data distribution and neural network smoothness. We introduce the cover complexity (CC) to measure the difficulty of learning a data set and the inverse of the modulus of continuity to quantify neural network smoothness. A quantitative bound for expected accuracy/error is derived by considering both the CC and neural network smoothness. Although most of the analysis is general and not specific to neural networks, we validate our theoretical assumptions and results numerically for neural networks by several data sets of images. The numerical results confirm that the expected error of trained networks scaled with the square root of the number of classes has a linear relationship with respect to the CC. We also observe a clear consistency between test loss and neural network smoothness during the training process. In addition, we demonstrate empirically that the neural network smoothness decreases when the network size increases whereas the smoothness is insensitive to training dataset size.
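
As a hedged reading of the smoothness measure the abstract describes (the paper's exact norms, domain, and normalization may differ), the standard modulus of continuity of a function \(f\) on a domain \(\Omega\), and its inverse, are

\[
\omega_f(\delta) = \sup_{x,y \in \Omega,\ \|x-y\| \le \delta} \|f(x) - f(y)\|,
\qquad
\omega_f^{-1}(\varepsilon) = \sup\{\delta \ge 0 : \omega_f(\delta) \le \varepsilon\}.
\]

A smooth network changes its output little under small input perturbations, so \(\omega_f(\delta)\) stays small and the inverse modulus \(\omega_f^{-1}(\varepsilon)\) is large; per the abstract, this smoothness quantity enters the derived accuracy/error bound together with the cover complexity (CC) of the data. Below is a minimal Monte-Carlo sketch of how such a modulus could be estimated empirically for a trained network; the sampling scheme and the toy stand-in for the network are assumptions, not the paper's protocol.

import numpy as np

def estimate_modulus(f, xs, delta, n_pairs=10_000, seed=0):
    """Lower-bound omega_f(delta) = sup_{||x-y||<=delta} ||f(x)-f(y)||
    by sampling random perturbations of norm <= delta around points xs."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_pairs):
        x = xs[rng.integers(len(xs))]
        u = rng.normal(size=x.shape)
        u *= rng.uniform(0.0, delta) / (np.linalg.norm(u) + 1e-12)  # ||u|| <= delta
        worst = max(worst, float(np.linalg.norm(f(x + u) - f(x))))
    return worst  # sampling only lower-bounds the true supremum

# Toy usage: a smooth stand-in for a trained classifier's output map.
f = lambda x: np.tanh(x)
xs = np.random.default_rng(1).normal(size=(100, 10))
print(estimate_modulus(f, xs, delta=0.1))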


Similar Articles

1. Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness.
Neural Netw. 2020 Oct;130:85-99. doi: 10.1016/j.neunet.2020.06.024. Epub 2020 Jul 3.
2. High-dimensional dynamics of generalization error in neural networks.
Neural Netw. 2020 Dec;132:428-446. doi: 10.1016/j.neunet.2020.08.022. Epub 2020 Sep 5.
3. Generalization Analysis of Pairwise Learning for Ranking With Deep Neural Networks.
Neural Comput. 2023 May 12;35(6):1135-1158. doi: 10.1162/neco_a_01585.
4. Upper bound of the expected training error of neural network regression for a Gaussian noise sequence.
Neural Netw. 2001 Dec;14(10):1419-29. doi: 10.1016/s0893-6080(01)00122-8.
5. Approximation rates for neural networks with encodable weights in smoothness spaces.
Neural Netw. 2021 Feb;134:107-130. doi: 10.1016/j.neunet.2020.11.010. Epub 2020 Nov 27.
6. Why ResNet Works? Residuals Generalize.
IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5349-5362. doi: 10.1109/TNNLS.2020.2966319. Epub 2020 Nov 30.
7. An analysis of training and generalization errors in shallow and deep networks.
Neural Netw. 2020 Jan;121:229-241. doi: 10.1016/j.neunet.2019.08.028. Epub 2019 Sep 7.
8. Approximation of smooth functionals using deep ReLU networks.
Neural Netw. 2023 Sep;166:424-436. doi: 10.1016/j.neunet.2023.07.012. Epub 2023 Jul 18.
9. Generalization Analysis of CNNs for Classification on Spheres.
IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):6200-6213. doi: 10.1109/TNNLS.2021.3134675. Epub 2023 Sep 1.
10. Going Deeper, Generalizing Better: An Information-Theoretic View for Deep Learning.
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):16683-16695. doi: 10.1109/TNNLS.2023.3297113. Epub 2024 Oct 29.

Cited By

1. Transformer-based long-term predictor of subthalamic beta activity in Parkinson's disease.
NPJ Parkinsons Dis. 2025 Jul 12;11(1):210. doi: 10.1038/s41531-025-01011-1.
2. A multimodal transformer system for noninvasive diabetic nephropathy diagnosis via retinal imaging.
NPJ Digit Med. 2025 Jan 24;8(1):50. doi: 10.1038/s41746-024-01393-1.
3. Physics-informed two-tier neural network for non-linear model order reduction.
Adv Model Simul Eng Sci. 2024;11(1):20. doi: 10.1186/s40323-024-00273-3. Epub 2024 Nov 14.
4. Effective data sampling strategies and boundary condition constraints of physics-informed neural networks for identifying material properties in solid mechanics.
Appl Math Mech. 2023 Jul;44(7):1039-1068. doi: 10.1007/s10483-023-2995-8. Epub 2023 Jul 3.
5. Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE.
Proc Mach Learn Res. 2020;119:11639-11649.