
An Efficient YOLO Algorithm with an Attention Mechanism for Vision-Based Defect Inspection Deployed on FPGA.

Author Information

Yu Longzhen, Zhu Jianhua, Zhao Qian, Wang Zhixian

Affiliations

College of Economics and Management, Qingdao University of Science and Technology, Qingdao 266000, China.

Department of Creative Informatics, Kyushu Institute of Technology, Fukuoka 804-8550, Japan.

Publication Information

Micromachines (Basel). 2022 Jun 30;13(7):1058. doi: 10.3390/mi13071058.

Abstract

Industry 4.0 features intelligent manufacturing, in which vision-based defect inspection algorithms play a key role in quality control for parts manufacturing. With the help of AI and machine learning, automatic and adaptive inspection can replace manual operation in this field, and much progress has been made in recent years. In this study, considering the inspection requirements of industrialization, we further improve smart defect inspection and propose an efficient algorithm that combines Field Programmable Gate Array (FPGA)-accelerated You Only Look Once (YOLO) v3 with an attention mechanism. First, because the camera angle and defect features are relatively fixed, an attention mechanism built on the idea of directing the focus of defect inspection is proposed. The attention mechanism consists of three improvements: (a) image preprocessing, which tailors images so that the network selectively concentrates on defect-relevant regions; preprocessing mainly includes cutting, zooming and splicing, named CZS operations; (b) tailoring the YOLOv3 backbone network, which ignores invalid inspection regions in the deep neural network and optimizes the network structure; and (c) data augmentation. The first two improvements efficiently reduce deep learning operations and accelerate inspection, but the preprocessed images are similar, and this lack of diversity would reduce network accuracy; (c) is therefore added to mitigate the shortage of training data. Second, the algorithm is deployed on a PYNQ-Z2 FPGA board to meet industrial production requirements for accuracy, efficiency and extensibility. FPGAs provide a low-latency, low-cost, power-efficient and flexible architecture that enables deep learning acceleration in industrial scenarios. The Xilinx Deep Neural Network Development Kit (DNNDK) converts the improved YOLOv3 into Programmable Logic (PL) that can be deployed on the FPGA; the conversion mainly consists of pruning, quantization and compilation. Experimental results show that the algorithm is highly efficient: inspection accuracy reaches 99.2%, processing speed reaches 1.54 frames per second (FPS), and power consumption is only 10 W.
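To make the CZS operations and the accompanying data augmentation described above concrete, the following is a minimal Python/OpenCV sketch. It is an illustrative reading of the abstract only: the ROI coordinates, tile size, 2 x 2 splice grid, and the flip/brightness augmentation are assumptions, not the paper's actual parameters.

```python
# Illustrative sketch of CZS (cutting, zooming, splicing) preprocessing and
# simple data augmentation. All ROIs, sizes, and augmentation choices are
# assumptions for demonstration; they are not taken from the paper.
import cv2
import numpy as np

def czs_preprocess(image, rois, tile_size=(208, 208), grid=(2, 2)):
    """Cut defect-relevant ROIs, zoom each to a fixed tile size, and splice
    the tiles into a single detector input.

    image: H x W x 3 BGR array (e.g. from cv2.imread)
    rois:  list of (x, y, w, h) regions fixed by the known camera geometry
    """
    tiles = []
    for (x, y, w, h) in rois:
        crop = image[y:y + h, x:x + w]                       # cutting
        crop = cv2.resize(crop, tile_size,
                          interpolation=cv2.INTER_LINEAR)    # zooming
        tiles.append(crop)

    rows, cols = grid
    assert len(tiles) == rows * cols, "ROI count must fill the splice grid"
    stripes = [np.hstack(tiles[r * cols:(r + 1) * cols]) for r in range(rows)]
    return np.vstack(stripes)                                # splicing

def augment(spliced, rng):
    """Counteract the similarity of preprocessed images with a random
    horizontal flip and brightness jitter (assumed augmentations)."""
    if rng.random() < 0.5:
        spliced = cv2.flip(spliced, 1)
    gain = rng.uniform(0.8, 1.2)
    return np.clip(spliced.astype(np.float32) * gain, 0, 255).astype(np.uint8)

# Example with four hypothetical ROIs from a fixed camera view:
# img = cv2.imread("part.jpg")
# net_input = czs_preprocess(img, [(0, 0, 400, 400), (600, 0, 400, 400),
#                                  (0, 500, 400, 400), (600, 500, 400, 400)])
# net_input = augment(net_input, np.random.default_rng(0))
```

With a 2 x 2 grid of 208 x 208 tiles the spliced output is 416 x 416, which happens to match the default YOLOv3 input resolution; the paper's real layout and network input size may differ.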


Figure 1 of the article: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/033e/9323378/1eaba2e1058d/micromachines-13-01058-g001.jpg
