BACKGROUND

In deep learning, the most significant breakthroughs in image recognition, object detection, and language processing have come from the convolutional neural network (CNN). With the rapid growth of data and of neural networks, the performance of deep neural network (DNN) algorithms depends on the computing power and storage capacity of the devices they run on.

METHODS

This paper studies the convolutional neural network as used in various image applications and its acceleration on several platforms: CPU, GPU, and TPU. The neural network structure and the computing power and characteristics of the GPU and TPU are analyzed and summarized, and their effect on accelerating the tasks is explained. The CNN is compared across platforms using three image applications: face mask detection (object detection, computer vision), virus detection in plants (image classification, agriculture sector), and pneumonia detection from X-ray images (image classification, medical field).

RESULTS

The CNN was implemented and compared comprehensively across the platforms to identify performance, throughput, bottlenecks, and training time. The execution of the CNN on GPU and TPU is explained through layer-wise analysis. The impact of the fully connected and convolutional layers on the network is analyzed. The challenges faced during acceleration are discussed, and future work is identified.
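
One way to see why convolutional and fully connected layers stress an accelerator differently is to count parameters and multiply-accumulate operations (MACs) per layer. The sketch below is only an illustration under stated assumptions: the three-layer network, its shapes, and the helper names `conv2d_cost` and `dense_cost` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class LayerCost:
    name: str
    params: int  # learnable weights plus biases
    macs: int    # multiply-accumulate operations per input image


def conv2d_cost(name, in_ch, out_ch, kernel, out_h, out_w):
    """Cost of a standard 2D convolution with a square kernel and a bias per filter."""
    weights_per_filter = kernel * kernel * in_ch
    params = (weights_per_filter + 1) * out_ch
    # Every output position of every filter needs one MAC per filter weight.
    macs = weights_per_filter * out_ch * out_h * out_w
    return LayerCost(name, params, macs)


def dense_cost(name, in_features, out_features):
    """Cost of a fully connected layer with a bias per output unit."""
    params = (in_features + 1) * out_features
    macs = in_features * out_features
    return LayerCost(name, params, macs)


# Hypothetical toy network for a 224x224 RGB image (assumed shapes, not the paper's model).
layers = [
    conv2d_cost("conv1", in_ch=3, out_ch=32, kernel=3, out_h=224, out_w=224),
    conv2d_cost("conv2", in_ch=32, out_ch=64, kernel=3, out_h=112, out_w=112),
    dense_cost("fc1", in_features=64 * 56 * 56, out_features=128),
]

for layer in layers:
    ratio = layer.macs / layer.params
    print(f"{layer.name}: params={layer.params:,} macs={layer.macs:,} macs/param={ratio:.1f}")
```

In this toy setting the convolutional layers reuse each weight thousands of times, so they tend to be compute-bound, while the fully connected layer uses each weight about once per image and holds most of the parameters, so it tends to be limited by memory capacity and bandwidth.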
