Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Fukuoka, Japan.
Research Center for Neuromorphic AI Hardware, Kyushu Institute of Technology, Kitakyushu, Fukuoka, Japan.
PLoS One. 2021 May 10;16(5):e0251329. doi: 10.1371/journal.pone.0251329. eCollection 2021.
In this study, we introduce a mixed-precision weights network (MPWN), a quantized neural network that jointly uses three weight spaces: binary {-1,1}, ternary {-1,0,1}, and 32-bit floating-point. We develop the MPWN from both software and hardware aspects. On the software side, we evaluate the MPWN on the Fashion-MNIST and CIFAR10 datasets and define the accuracy sparsity bit score, a linear combination of accuracy, sparsity, and number of bits. This score allows Bayesian optimization to search efficiently for MPWN weight space combinations. On the hardware side, we propose XOR signed-bits to exploit the floating-point and binary weight spaces in the MPWN. XOR signed-bits is an efficient implementation equivalent to multiplying a floating-point value by a binary weight. Building on the same concept, we also provide a ternary bitwise operation that is an efficient implementation equivalent to multiplying a floating-point value by a ternary weight. To demonstrate the compatibility of the MPWN with hardware implementation, we synthesized and implemented the MPWN on a field-programmable gate array using high-level synthesis. Depending on the resource type, our MPWN implementation used 1.68 to 4.89 times fewer hardware resources than a conventional 32-bit floating-point model. In addition, our implementation reduced latency by up to 31.55 times compared with a 32-bit floating-point model without optimizations.
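The equivalence behind XOR signed-bits can be illustrated with a minimal Python sketch. It is not the authors' hardware implementation. It assumes IEEE 754 single-precision values and a hypothetical weight encoding that the abstract does not specify: a binary weight is stored as one bit (0 for +1, 1 for -1), and a ternary weight as a nonzero flag plus the same sign bit. The function names are illustrative only.

```python
import struct


def float_to_bits(x):
    """Reinterpret a value as its 32-bit IEEE 754 single-precision pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def xor_sign_multiply(x, w_sign_bit):
    """Multiply x by a binary weight in {-1, +1} without a multiplier.

    Assumed encoding: w_sign_bit = 0 means +1, w_sign_bit = 1 means -1.
    Multiplying by -1 only flips the float's sign bit, so XOR-ing the
    weight bit into bit 31 gives the same result as the multiplication.
    """
    return bits_to_float(float_to_bits(x) ^ (w_sign_bit << 31))


def ternary_multiply(x, w_nonzero, w_sign_bit):
    """Multiply x by a ternary weight in {-1, 0, +1}.

    Assumed encoding: w_nonzero = 0 means the weight is 0; otherwise the
    sign bit is interpreted exactly as in xor_sign_multiply.
    """
    if not w_nonzero:
        return 0.0  # zero weight: output is gated to zero
    return xor_sign_multiply(x, w_sign_bit)


if __name__ == "__main__":
    for x in (1.5, -2.25):
        print(x, xor_sign_multiply(x, 0), xor_sign_multiply(x, 1))
    print(ternary_multiply(3.0, 0, 1), ternary_multiply(3.0, 1, 1))
```

In this sketch the zero-weight case always returns positive zero, whereas a true IEEE 754 multiplication of a negative value by zero yields negative zero; the two compare equal, and the difference does not affect later accumulation.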