A practical generalization metric for deep networks benchmarking.

Authors

Huang Mengqing, Yu Hongchuan, Zhang Jianjun

Affiliation

National Centre for Computer Animation, Bournemouth University, Poole, BH12 5BB, UK.

Publication

Sci Rep. 2025 Mar 21;15(1):9747. doi: 10.1038/s41598-025-93005-5.

DOI:10.1038/s41598-025-93005-5
PMID:40119019
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11928604/
Abstract

There is an ongoing, dedicated effort to estimate bounds on the generalization error of deep learning models, coupled with a growing interest in practical metrics that can be used to evaluate a model's ability to generalize experimentally. This interest is not only driven by practical considerations but is also vital for theoretical research, as theoretical estimates require practical validation. However, there is currently little research on benchmarking the generalization capacity of different deep networks and verifying these theoretical estimates. This paper introduces a practical generalization metric for benchmarking different deep networks and proposes a novel testbed for verifying theoretical estimates. Our findings indicate that a deep network's generalization capacity in classification tasks depends on both classification accuracy and the diversity of unseen data. The proposed metric system quantifies both the accuracy of deep learning models and the diversity of the data, and summarizes them in an intuitive, quantitative measure: a trade-off point. Furthermore, we compare our practical metric with existing theoretical generalization estimates using our benchmarking testbed. Discouragingly, most of the available generalization estimates do not correlate with the practical measurements obtained using our testbed. Nevertheless, this finding is significant: it exposes the shortcomings of theoretical estimates and should inspire new exploration.
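The abstract does not say how the agreement between theoretical estimates and practical measurements is scored. As a minimal sketch of one common way to check such agreement, the snippet below computes a Kendall rank correlation between a theoretical estimate and a measured error across several models. The function name `kendall_tau`, the choice of rank correlation, and the numbers in the usage example are all assumptions made for illustration; none of them come from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall rank correlation (tau-a) between two equal-length sequences.

    Illustrative helper, not the paper's method: it counts pairs of models
    that both sequences order the same way (concordant) versus the opposite
    way (discordant). Tied pairs count as neither.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    # Hypothetical placeholder values for four models; NOT results from the paper.
    theoretical_estimate = [0.9, 0.4, 0.7, 0.2]
    measured_error = [0.10, 0.30, 0.20, 0.25]
    print(kendall_tau(theoretical_estimate, measured_error))
```

A value near 1 would mean the theoretical estimate ranks the models the same way the measurements do; a value near 0 or below would mean it does not, which is the kind of mismatch the abstract reports for most existing estimates.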


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7ca/11928604/56c99cff395b/41598_2025_93005_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7ca/11928604/6cea9911a42b/41598_2025_93005_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7ca/11928604/bee26ac1e745/41598_2025_93005_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7ca/11928604/ad2232b98309/41598_2025_93005_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7ca/11928604/309da4461896/41598_2025_93005_Fig5_HTML.jpg
