Effect of neural network structure in accelerating performance and accuracy of a convolutional neural network with GPU/TPU for image analytics.

Authors

Ravikumar Aswathy, Sriraman Harini, Sai Saketh P Maruthi, Lokesh Saddikuti, Karanam Abhiram

Affiliation

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India.

Publication

PeerJ Comput Sci. 2022 Mar 3;8:e909. doi: 10.7717/peerj-cs.909. eCollection 2022.

DOI:10.7717/peerj-cs.909
PMID:35494877
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9044238/
Abstract

BACKGROUND

In deep learning, the most significant breakthroughs in image recognition, object detection, and language processing were achieved by convolutional neural networks (CNNs). With the rapid growth in data and in neural networks, the performance of deep neural network (DNN) algorithms depends on the computation power and storage capacity of the devices.

METHODS

This paper studies convolutional neural networks used for various image applications and their acceleration on CPU, GPU, and TPU platforms. The neural network structure and the computing power and characteristics of the GPU and TPU are analyzed and summarized, and their effect on accelerating the tasks is explained. A cross-platform comparison of the CNN was carried out using three image applications: face mask detection (object detection/computer vision), virus detection in plants (image classification, agriculture sector), and pneumonia detection from X-ray images (image classification, medical field).
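
A cross-platform comparison of this kind usually rests on a timing harness that is identical on every device. The following is a minimal sketch of such a harness, not the authors' code: `measure_throughput` and `dummy_step` are hypothetical names, and the dummy workload stands in for a real CNN training step. On a GPU or TPU the step function would also have to block until the device has finished, otherwise only the kernel launch is timed.

```python
import time
from typing import Callable


def measure_throughput(step_fn: Callable[[], None], batch_size: int,
                       warmup: int = 3, iters: int = 20) -> float:
    """Return images processed per second for one repeated step.

    step_fn must run a single batch to completion (hypothetical contract:
    on an accelerator it has to synchronize before returning).
    """
    # Warm-up runs absorb one-time costs such as compilation or cache fill.
    for _ in range(warmup):
        step_fn()

    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    elapsed = time.perf_counter() - start

    return batch_size * iters / elapsed


def dummy_step() -> None:
    # Stand-in workload; replace with a real training or inference step.
    sum(i * i for i in range(10_000))


throughput = measure_throughput(dummy_step, batch_size=32)
print(f"{throughput:.1f} images/s")
```

Running the same function with a CPU, GPU, or TPU step and the same batch size gives figures that can be compared directly; training time per epoch follows by dividing the dataset size by the measured throughput.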

RESULTS

The CNN was implemented and comprehensively compared across the platforms to identify performance, throughput, bottlenecks, and training time. Layer-wise execution of the CNN on the GPU and TPU is explained through layer-wise analysis. The impact of the fully connected and convolutional layers on the network is analyzed. The challenges faced during acceleration are discussed and future work is identified.
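
One common way to see why fully connected and convolutional layers affect a network differently is to count parameters and multiply-accumulate operations (MACs) per layer. The sketch below uses the standard counting formulas; the layer shapes are illustrative assumptions (VGG-like sizes), not values taken from the paper, and `LayerCost`, `conv2d_cost`, and `dense_cost` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerCost:
    name: str
    params: int  # trainable weights plus biases
    macs: int    # multiply-accumulates for one input image

    @property
    def macs_per_param(self) -> float:
        # Rough measure of how often each weight is reused per image.
        return self.macs / self.params


def conv2d_cost(name: str, k: int, c_in: int, c_out: int,
                h_out: int, w_out: int) -> LayerCost:
    """Cost of a k x k convolution producing an h_out x w_out feature map."""
    weights = k * k * c_in * c_out
    # Every output position reuses the same kernel weights.
    return LayerCost(name, weights + c_out, weights * h_out * w_out)


def dense_cost(name: str, n_in: int, n_out: int) -> LayerCost:
    """Cost of a fully connected layer; each weight is used once per image."""
    weights = n_in * n_out
    return LayerCost(name, weights + n_out, weights)


# Assumed shapes for illustration only.
conv = conv2d_cost("conv3x3", k=3, c_in=64, c_out=128, h_out=56, w_out=56)
fc = dense_cost("fc", n_in=25088, n_out=4096)

for layer in (conv, fc):
    print(f"{layer.name}: params={layer.params:,} macs={layer.macs:,} "
          f"macs/param={layer.macs_per_param:.1f}")
```

With these assumed shapes the convolution holds far fewer parameters than the fully connected layer but performs more arithmetic, which is why convolutional layers tend to be limited by compute and fully connected layers by memory traffic.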

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/f67061242a1d/peerj-cs-08-909-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/058ce1d083e5/peerj-cs-08-909-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/a44e625838bf/peerj-cs-08-909-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/d7212284d2ea/peerj-cs-08-909-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/7f09c954cd5f/peerj-cs-08-909-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/ea5badd7d37d/peerj-cs-08-909-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/c600c68ce7cb/peerj-cs-08-909-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/1bbe1094134a/peerj-cs-08-909-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/6dc9e0853d53/peerj-cs-08-909-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/3096bb38b07b/peerj-cs-08-909-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/c9ab76406b48/peerj-cs-08-909-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/756da5039c5b/peerj-cs-08-909-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/c30fbb650b74/peerj-cs-08-909-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/5314754aa8cc/peerj-cs-08-909-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/1150571c8214/peerj-cs-08-909-g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93d9/9044238/4c9b78ee9f38/peerj-cs-08-909-g016.jpg