Chen Jun, Liu Liang, Liu Yong, Zeng Xianfang
IEEE Trans Neural Netw Learn Syst. 2021 Mar;32(3):1067-1081. doi: 10.1109/TNNLS.2020.2980041. Epub 2021 Mar 1.
The quantized neural network (QNN) is an efficient approach to network compression and is well suited to implementation on field-programmable gate arrays (FPGAs). This article proposes a novel learning framework for n-bit QNNs whose weights are constrained to powers of two. To solve the gradient vanishing problem, we propose a reconstructed gradient function for QNNs in the back-propagation algorithm that directly computes the real gradient rather than estimating an approximate gradient of the expected loss. We also propose a novel QNN structure named n-BQ-NN, which replaces multiply operations with shift operations and is better suited to inference on FPGAs. Furthermore, we design a shift vector processing element (SVPE) array that replaces all 16-bit multiplications in the convolution operation with SHIFT operations on FPGAs. We carry out comparative experiments to evaluate our framework. The experimental results show that ResNet, DenseNet, and AlexNet models quantized with our learning framework achieve almost the same accuracies as the original full-precision models. Moreover, when our learning framework is used to train n-BQ-NN from scratch, it achieves state-of-the-art results compared with typical low-precision QNNs. Experiments on the Xilinx ZCU102 platform show that our n-BQ-NN with the SVPE array executes inference 2.9 times faster than with the vector processing element (VPE) array. Because the SHIFT operations in our SVPE array consume no digital signal processing (DSP) resources on FPGAs, the experiments also show that the SVPE array reduces average energy consumption to 68.7% of that of the 16-bit VPE array.
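The core idea behind replacing multiplications with shifts can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the paper's actual quantization scheme or SVPE design: `quantize_pow2` and `shift_multiply` are hypothetical names, and the exponent clamping assumes weight magnitudes of at most 1.

```python
# Minimal sketch of why power-of-two weights matter: multiplying a
# fixed-point activation by sign * 2^e reduces to a bit shift, which
# is the property the SVPE array exploits on FPGAs in place of 16-bit
# hardware multipliers. All names here are illustrative assumptions.
import math

def quantize_pow2(w, n_bits=4):
    """Round |w| to the nearest power of two and keep the sign.

    Returns (sign, exponent) -- a simplified stand-in for an n-bit
    power-of-two weight constraint.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    e = round(math.log2(abs(w)))
    # Clamp the exponent to an assumed n-bit range (|w| <= 1).
    e = max(min(e, 0), -(2 ** (n_bits - 1)))
    return sign, e

def shift_multiply(x_fixed, sign, e):
    """Multiply an integer (fixed-point) activation by sign * 2^e
    using only shifts, so no hardware multiplier is needed."""
    return sign * (x_fixed << e) if e >= 0 else sign * (x_fixed >> -e)

sign, e = quantize_pow2(0.25)        # 0.25 is exactly 2^-2
print(shift_multiply(16, sign, e))   # 16 * 0.25 -> 4
```

On an FPGA such a shift is pure routing or a barrel shifter in logic fabric, which is why the abstract reports that the SVPE array consumes no DSP slices.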