

Computation and memory optimized spectral domain convolutional neural network for throughput and energy-efficient inference.

Author Information

Rizvi Shahriyar Masud, Rahman Ab Al-Hadi Ab, Sheikh Usman Ullah, Fuad Kazi Ahmed Asif, Shehzad Hafiz Muhammad Faisal

Affiliations

VeCAD Research Laboratory, School of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, 81310 Johor Malaysia.

School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331 USA.

Publication Information

Appl Intell (Dordr). 2023;53(4):4499-4523. doi: 10.1007/s10489-022-03756-1. Epub 2022 Jun 11.

DOI: 10.1007/s10489-022-03756-1
PMID: 35730044
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9188280/
Abstract

Conventional convolutional neural networks (CNNs) present a high computational workload and memory access cost (CMC). Spectral domain CNNs (SpCNNs) offer a computationally efficient approach to compute CNN training and inference. This paper investigates CMC of SpCNNs and its contributing components analytically and then proposes a methodology to optimize CMC, under three strategies, to enhance inference performance. In this methodology, output feature map (OFM) size, OFM depth or both are progressively reduced under an accuracy constraint to compute performance-optimized CNN inference. Before conducting training or testing, it can provide designers guidelines and preliminary insights regarding techniques for optimum performance, least degradation in accuracy and a balanced performance-accuracy trade-off. This methodology was evaluated on MNIST and Fashion MNIST datasets using LeNet-5 and AlexNet architectures. When compared to state-of-the-art SpCNN models, LeNet-5 achieves up to 4.2× (batch inference) and 4.1× (single-image inference) higher throughputs and 10.5× (batch inference) and 4.2× (single-image inference) greater energy efficiency at a maximum loss of 3% in test accuracy. When compared to the baseline model used in this study, AlexNet delivers 11.6× (batch inference) and 5× (single-image inference) increased throughput and 25× (batch inference) and 8.8× (single-image inference) more energy-efficient inference with just 4.4% reduction in accuracy.

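The spectral-domain computation the paper builds on is the convolution theorem: zero-padded convolution in the spatial domain equals element-wise multiplication of the operands' Fourier transforms, which replaces the sliding-window multiply-accumulate loop with FFTs and a pointwise product. A minimal single-channel NumPy sketch of this idea, checked against a direct spatial-domain reference (illustrative only — not the authors' implementation; function names are ours):

```python
import numpy as np

def spectral_conv2d(image, kernel):
    """2-D linear convolution computed in the spectral domain.

    Zero-pads both operands to the full output size via the FFT's `s`
    argument, multiplies their spectra element-wise, and inverts.
    """
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih + kh - 1, iw + kw - 1          # full linear-convolution size
    F_img = np.fft.rfft2(image, s=(oh, ow))    # zero-padded forward FFTs
    F_ker = np.fft.rfft2(kernel, s=(oh, ow))
    return np.fft.irfft2(F_img * F_ker, s=(oh, ow))

def direct_conv2d(image, kernel):
    """Reference spatial-domain linear convolution (shift-and-add)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((28, 28))            # MNIST-sized input
ker = rng.standard_normal((5, 5))              # LeNet-style 5x5 filter
assert np.allclose(spectral_conv2d(img, ker), direct_conv2d(img, ker))
```

For a single small filter the FFT route is not a win, but when many filters reuse the same transformed input — as in a CNN layer — the pointwise products amortize the FFT cost, which is the efficiency the paper's CMC analysis then optimizes further by shrinking OFM size and depth.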

Figures (PMC):
Fig 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/c1cb284fe245/10489_2022_3756_Fig1_HTML.jpg
Fig 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/91504e87f95e/10489_2022_3756_Fig2_HTML.jpg
Fig 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/6503bbc2b358/10489_2022_3756_Fig3_HTML.jpg
Fig 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/125c6266b0c0/10489_2022_3756_Fig4_HTML.jpg
Fig 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/76b55a55da24/10489_2022_3756_Fig5_HTML.jpg
Fig 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/e50b398c570a/10489_2022_3756_Fig6_HTML.jpg
Fig 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/1615aeba68e5/10489_2022_3756_Fig7_HTML.jpg
Fig 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/eb5242d3a09b/10489_2022_3756_Fig8_HTML.jpg
Fig 9: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/e719fb8a0e61/10489_2022_3756_Fig9_HTML.jpg
Fig 10: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d583/9188280/1e351e9e8e80/10489_2022_3756_Fig10_HTML.jpg

Similar Articles

1. Computation and memory optimized spectral domain convolutional neural network for throughput and energy-efficient inference.
Appl Intell (Dordr). 2023;53(4):4499-4523. doi: 10.1007/s10489-022-03756-1. Epub 2022 Jun 11.
2. Effective Plug-Ins for Reducing Inference-Latency of Spiking Convolutional Neural Networks During Inference Phase.
Front Comput Neurosci. 2021 Oct 18;15:697469. doi: 10.3389/fncom.2021.697469. eCollection 2021.
3. Design of Fully Spectral CNNs for Efficient FPGA-Based Acceleration.
IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):8111-8123. doi: 10.1109/TNNLS.2022.3224779. Epub 2024 Jun 3.
4. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification.
Front Neurosci. 2017 Dec 7;11:682. doi: 10.3389/fnins.2017.00682. eCollection 2017.
5. A Scatter-and-Gather Spiking Convolutional Neural Network on a Reconfigurable Neuromorphic Hardware.
Front Neurosci. 2021 Nov 16;15:694170. doi: 10.3389/fnins.2021.694170. eCollection 2021.
6. DT-SCNN: dual-threshold spiking convolutional neural network with fewer operations and memory access for edge applications.
Front Comput Neurosci. 2024 May 30;18:1418115. doi: 10.3389/fncom.2024.1418115. eCollection 2024.
7. A Low-Power Hardware Architecture for Real-Time CNN Computing.
Sensors (Basel). 2023 Feb 11;23(4):2045. doi: 10.3390/s23042045.
8. Toward Compact ConvNets via Structure-Sparsity Regularized Filter Pruning.
IEEE Trans Neural Netw Learn Syst. 2020 Feb;31(2):574-588. doi: 10.1109/TNNLS.2019.2906563. Epub 2019 Apr 12.
9. Strategy to improve the accuracy of convolutional neural network architectures applied to digital image steganalysis in the spatial domain.
PeerJ Comput Sci. 2021 Apr 9;7:e451. doi: 10.7717/peerj-cs.451. eCollection 2021.
10. Acceleration of Deep Neural Network Training Using Field Programmable Gate Arrays.
Comput Intell Neurosci. 2022 Oct 17;2022:8387364. doi: 10.1155/2022/8387364. eCollection 2022.