Department of AI Automation Robot, Daegu Catholic University, 13-13 Hayang-ro, Hayang-eup, Gyeongsan-si 38430, Gyeongsangbuk-do, Republic of Korea.
Driving Image Recognition Logic Cell, Hyundai Mobis, 17-2 Mabuk-ro 240beon-gil, Giheung-gu, Yongin-si 16891, Gyeonggi-do, Republic of Korea.
Sensors (Basel). 2023 Apr 6;23(7):3777. doi: 10.3390/s23073777.
This paper presents a method for simplifying and quantizing a deep neural network (DNN)-based object detector to embed it into a real-time edge device. For network simplification, this paper compares five methods for applying channel pruning to a residual block, because special care must be taken regarding the number of channels when summing two feature maps. Based on a comparison in terms of detection performance, number of parameters, computational complexity, and processing time, this paper identifies the most suitable method for the edge device. For network quantization, this paper compares post-training quantization (PTQ) and quantization-aware training (QAT) using two datasets with different detection difficulties. This comparison shows that both approaches are suitable for the easy-to-detect dataset, but QAT is preferable for the difficult-to-detect dataset. Through experiments, this paper shows that the proposed method can effectively embed the DNN-based object detector into an edge device equipped with Qualcomm's QCS605 System-on-Chip (SoC), while achieving real-time operation at more than 10 frames per second.
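The channel-count constraint mentioned above can be illustrated with a minimal sketch. It does not reproduce any of the five methods compared in the paper. It only shows why pruning the two inputs of a residual addition independently can leave them with different surviving channels, and one possible remedy (a shared mask ranked by summed importance). The function names, the importance scores, and the summed-importance rule are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def top_k_channels(importance: List[float], k: int) -> Set[int]:
    """Return the indices of the k channels with the largest importance scores."""
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return set(ranked[:k])


def residual_keep_sets(
    skip_importance: List[float], branch_importance: List[float], k: int
) -> Tuple[Set[int], Set[int], Set[int]]:
    """Compare independent pruning with a shared mask for a residual addition.

    Hypothetical helper: the scores stand in for any per-channel pruning
    criterion (for example, a weight norm), which the abstract does not specify.
    """
    # Independent pruning: each input to the addition keeps its own top-k channels.
    keep_skip = top_k_channels(skip_importance, k)
    keep_branch = top_k_channels(branch_importance, k)
    # Shared mask (assumed remedy): rank channels by summed importance so that
    # both inputs keep exactly the same channel indices and can still be added.
    combined = [s + b for s, b in zip(skip_importance, branch_importance)]
    shared = top_k_channels(combined, k)
    return keep_skip, keep_branch, shared


# Made-up importance scores for a 4-channel residual block, keeping 2 channels.
skip_scores = [0.9, 0.1, 0.8, 0.2]
branch_scores = [0.1, 0.7, 0.6, 0.3]
keep_skip, keep_branch, shared = residual_keep_sets(skip_scores, branch_scores, k=2)
# The independent sets differ, so the element-wise sum would mix unrelated channels.
mismatch = keep_skip != keep_branch
```

With these made-up scores the two independently pruned inputs keep different channel sets, whereas the shared mask gives both inputs the same indices, so the element-wise sum remains well defined.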