Nikolić Jelena, Perić Zoran, Aleksić Danijela, Tomić Stefan, Jovanović Aleksandra
Faculty of Electronic Engineering, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia.
Department of Mobile Network Nis, Telekom Srbija, Vozdova 11, 18000 Nis, Serbia.
Entropy (Basel). 2021 Dec 20;23(12):1699. doi: 10.3390/e23121699.
Motivated by the need to compress weights in neural networks (NNs), which is especially beneficial for edge devices with constrained resources, and by the need to use the simplest possible quantization model, this paper studies the performance of three-bit post-training uniform quantization. The goal is to gather the various choices of the quantizer's key parameter (the support region threshold) in one place and to provide a detailed overview of how this choice affects the performance of post-training quantization on the MNIST dataset. Specifically, we analyze whether the accuracy of two NN models (MLP and CNN) can be largely preserved with a very simple three-bit uniform quantizer, regardless of the choice of the key parameter. Moreover, we aim to answer whether, in post-training three-bit uniform quantization, it is as important as it is in quantization generally to determine the optimal support region threshold of the quantizer in order to achieve a predefined accuracy of the quantized neural network (QNN). The results show that the choice of the support region threshold of the three-bit uniform quantizer does not have a strong impact on the accuracy of the QNNs, which is not the case for two-bit uniform post-training quantization applied to the MLP for the same classification task. Accordingly, one can anticipate that, owing to this property, the post-training quantization model in question can be widely exploited.
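To make the role of the support region threshold concrete, the following is a minimal sketch of a three-bit uniform quantizer applied to a list of weights. It assumes a symmetric mid-rise design on the interval [-x_max, x_max] with clipping outside that interval. The abstract does not specify these details, so the function name `uniform_quantize`, the parameter `x_max`, and the mid-rise choice are illustrative assumptions rather than the authors' implementation.

```python
def uniform_quantize(weights, x_max, bits=3):
    """Symmetric mid-rise uniform quantizer on [-x_max, x_max].

    Assumption: x_max plays the role of the support region threshold;
    the exact quantizer design used in the paper may differ.
    """
    levels = 2 ** bits             # 8 representation levels for three bits
    step = 2.0 * x_max / levels    # width of each quantization cell
    quantized = []
    for w in weights:
        # Index of the cell that contains w.
        idx = int((w + x_max) // step)
        # Weights outside the support region fall into the outermost cells.
        idx = min(max(idx, 0), levels - 1)
        # The representation level is the midpoint of the selected cell.
        quantized.append(-x_max + (idx + 0.5) * step)
    return quantized


if __name__ == "__main__":
    # Hypothetical weights, chosen only to show in-range and clipped values.
    example = [-3.0, -0.1, 0.0, 0.3, 1.99, 5.0]
    print(uniform_quantize(example, x_max=2.0))
```

In this sketch, a larger `x_max` clips fewer weights but widens every cell, while a smaller `x_max` does the opposite. This trade-off is what the choice of support region threshold controls.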