
A framework for measuring the training efficiency of a neural architecture.

Author Information

Eduardo Cueto-Mendoza, John Kelleher

Affiliations

School of Computer Science, TU Dublin, Grangegorman, Dublin 7, D07 H6K8, Co. Dublin, Ireland.

ADAPT Research Centre, School of Computer Science and Statistics, Trinity College Dublin, Dublin 2, Co. Dublin, Ireland.

Publication Information

Artif Intell Rev. 2024;57(12):349. doi: 10.1007/s10462-024-10943-8. Epub 2024 Oct 28.

DOI: 10.1007/s10462-024-10943-8
PMID: 39478973
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11519118/
Abstract

Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks and Bayesian equivalents on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find a non-linear relationship between training stopping criteria, model size, and training efficiency. Furthermore, we illustrate the potential confounding effects of overtraining on measuring the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more efficient than BCNNs on both datasets. More generally, as a learning task becomes more complex, the relative difference in training efficiency between different architectures becomes more pronounced.
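The record does not include the authors' code, so the sketch below only illustrates the kind of measurement the abstract describes, under one assumed definition: training efficiency at epoch t is taken to be test accuracy divided by t, with epochs as the unit of training cost. The SmallCNN model, the metric itself, and all hyperparameters are illustrative choices, not the paper's.

```python
# Minimal sketch (not the authors' framework): track an assumed training-
# efficiency metric, accuracy-per-epoch, while a small CNN trains on MNIST.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

class SmallCNN(nn.Module):
    """Illustrative two-layer CNN; not the architecture from the paper."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.fc = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 28x28 -> 14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # 14x14 -> 7x7
        return self.fc(x.flatten(1))

def accuracy(model, loader, device):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    return correct / total

def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tfm = transforms.ToTensor()
    train = datasets.MNIST("data", train=True, download=True, transform=tfm)
    test = datasets.MNIST("data", train=False, download=True, transform=tfm)
    train_loader = DataLoader(train, batch_size=128, shuffle=True)
    test_loader = DataLoader(test, batch_size=512)

    model = SmallCNN().to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for epoch in range(1, 11):  # each epoch adds one unit of training cost
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
        acc = accuracy(model, test_loader, device)
        # Under the assumed definition, efficiency = accuracy / cost so far.
        # The ratio shrinking once accuracy plateaus mirrors the abstract's
        # observation that training efficiency decays as training progresses.
        print(f"epoch {epoch:2d}  acc={acc:.4f}  efficiency={acc / epoch:.4f}")

if __name__ == "__main__":
    main()
```

Printing the ratio at every epoch also makes the stopping-criterion point concrete: reading the metric off at different epochs is effectively comparing different stopping criteria, each of which assigns the same architecture a different measured efficiency.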


[Figures 1-15: image files for the article's figures are available via the PMC full text at https://pmc.ncbi.nlm.nih.gov/articles/PMC11519118/]

Similar Articles

1. A framework for measuring the training efficiency of a neural architecture. Artif Intell Rev. 2024;57(12):349. doi: 10.1007/s10462-024-10943-8. Epub 2024 Oct 28.
2. SIRe-Networks: Convolutional neural networks architectural extension for information preservation via skip/residual connections and interlaced auto-encoders. Neural Netw. 2022 Sep;153:386-398. doi: 10.1016/j.neunet.2022.06.030. Epub 2022 Jun 27.
3. Self-Growing Binary Activation Network: A Novel Deep Learning Model With Dynamic Architecture. IEEE Trans Neural Netw Learn Syst. 2022 May 27;PP. doi: 10.1109/TNNLS.2022.3176027.
4. MABAL: a Novel Deep-Learning Architecture for Machine-Assisted Bone Age Labeling. J Digit Imaging. 2018 Aug;31(4):513-519. doi: 10.1007/s10278-018-0053-3.
5. An Experimental Review on Deep Learning Architectures for Time Series Forecasting. Int J Neural Syst. 2021 Mar;31(3):2130001. doi: 10.1142/S0129065721300011. Epub 2021 Feb 16.
6. Evolution of Deep Convolutional Neural Networks Using Cartesian Genetic Programming. Evol Comput. 2020 Spring;28(1):141-163. doi: 10.1162/evco_a_00253. Epub 2019 Mar 22.
7. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification. Front Neurosci. 2017 Dec 7;11:682. doi: 10.3389/fnins.2017.00682. eCollection 2017.
8. Deep Convolutional Neural Networks for large-scale speech tasks. Neural Netw. 2015 Apr;64:39-48. doi: 10.1016/j.neunet.2014.08.005. Epub 2014 Sep 16.
9. Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures. Front Neurosci. 2020 Feb 28;14:119. doi: 10.3389/fnins.2020.00119. eCollection 2020.
10. Training Lightweight Deep Convolutional Neural Networks Using Bag-of-Features Pooling. IEEE Trans Neural Netw Learn Syst. 2019 Jun;30(6):1705-1715. doi: 10.1109/TNNLS.2018.2872995. Epub 2018 Oct 24.
