Mixed-precision weights network for field-programmable gate array.

Affiliations

Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Fukuoka, Japan.

Research Center for Neuromorphic AI Hardware, Kyushu Institute of Technology, Kitakyushu, Fukuoka, Japan.

Publication information

PLoS One. 2021 May 10;16(5):e0251329. doi: 10.1371/journal.pone.0251329. eCollection 2021.

DOI:10.1371/journal.pone.0251329
PMID:33970965
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8109814/
Abstract

In this study, we introduced a mixed-precision weights network (MPWN), a quantized neural network that jointly uses three different weight spaces: binary {-1,1}, ternary {-1,0,1}, and 32-bit floating-point. We further developed the MPWN from both software and hardware aspects. From the software aspect, we evaluated the MPWN on the Fashion-MNIST and CIFAR10 datasets. We formulated the accuracy sparsity bit score, which is a linear combination of accuracy, sparsity, and number of bits. This score allows Bayesian optimization to be used efficiently to search for MPWN weight space combinations. From the hardware aspect, we proposed XOR signed-bits to explore floating-point and binary weight spaces in the MPWN. XOR signed-bits is an efficient implementation equivalent to multiplication of floating-point and binary weight spaces. Using the concept from XOR signed-bits, we also provide a ternary bitwise operation that is an efficient implementation equivalent to multiplication of floating-point and ternary weight spaces. To demonstrate the compatibility of the MPWN with hardware implementation, we synthesized and implemented the MPWN in a field-programmable gate array using high-level synthesis. Depending on the type of resource, our proposed MPWN implementation used 1.68 to 4.89 times fewer hardware resources than a conventional 32-bit floating-point model. In addition, our implementation reduced latency by up to 31.55 times compared with a 32-bit floating-point model without optimizations.
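The idea behind XOR signed-bits can be illustrated in software. Multiplying an IEEE 754 floating-point value by a weight in {-1,1} only changes its sign, so it can be done by XOR-ing the sign bit instead of using a multiplier. The Python sketch below is an illustration of that principle only, not the authors' FPGA implementation. The helper names and the ternary encoding (a sign bit plus a non-zero flag) are assumptions made for this example; the abstract does not specify the encoding used in the paper.

```python
import struct

SIGN_SHIFT = 31          # position of the sign bit in IEEE 754 binary32
ALL_ONES = 0xFFFFFFFF    # 32-bit mask


def float_to_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Interpret an unsigned 32-bit integer as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def mul_binary(x: float, w_sign: int) -> float:
    """Multiply x by a binary weight without a multiplier.

    w_sign is the (hypothetical) 1-bit encoding of the weight:
    0 means +1, 1 means -1. XOR-ing it into the sign bit of x
    flips the sign exactly when the weight is -1.
    """
    return bits_to_float(float_to_bits(x) ^ (w_sign << SIGN_SHIFT))


def mul_ternary(x: float, w_sign: int, w_nonzero: int) -> float:
    """Multiply x by a ternary weight using only bitwise operations.

    Assumed 2-bit encoding: w_nonzero = 0 encodes weight 0; otherwise
    w_sign selects +1 (0) or -1 (1). An all-zero mask clears every bit
    of the result, which is the bit pattern of +0.0.
    """
    mask = ALL_ONES if w_nonzero else 0
    return bits_to_float((float_to_bits(x) ^ (w_sign << SIGN_SHIFT)) & mask)


if __name__ == "__main__":
    for x in (1.5, -0.25, 8.0):
        assert mul_binary(x, 0) == x * 1
        assert mul_binary(x, 1) == x * -1
        assert mul_ternary(x, 1, 1) == x * -1
        assert mul_ternary(x, 1, 0) == 0.0
    print("bitwise results match ordinary multiplication")
```

The test values are exactly representable in binary32, so the bitwise results compare equal to ordinary multiplication. In hardware the same operation reduces to a single XOR gate on the sign bit, plus an AND mask for the ternary case.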

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/23b812fa6c81/pone.0251329.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/ae610e39a546/pone.0251329.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/47e0593af0a6/pone.0251329.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/a448f96327be/pone.0251329.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/f16a02cbe007/pone.0251329.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/c0a08b6856c7/pone.0251329.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/14984d9758cc/pone.0251329.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/3c66620c4d77/pone.0251329.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6642/8109814/379bae412e9a/pone.0251329.g009.jpg

Similar articles

1
Mixed-precision weights network for field-programmable gate array.
PLoS One. 2021 May 10;16(5):e0251329. doi: 10.1371/journal.pone.0251329. eCollection 2021.
2
An Efficient Hardware Architecture for Template Matching-Based Spike Sorting.
IEEE Trans Biomed Circuits Syst. 2019 Jun;13(3):481-492. doi: 10.1109/TBCAS.2019.2907882. Epub 2019 Mar 27.
3
Optimal Architecture of Floating-Point Arithmetic for Neural Network Training Processors.
Sensors (Basel). 2022 Feb 6;22(3):1230. doi: 10.3390/s22031230.
4
CHARLES: A C++ fixed-point library for Photonic-Aware Neural Networks.
Neural Netw. 2023 May;162:531-540. doi: 10.1016/j.neunet.2023.03.007. Epub 2023 Mar 21.
5
Implementation of pipelined FastICA on FPGA for real-time blind source separation.
IEEE Trans Neural Netw. 2008 Jun;19(6):958-70. doi: 10.1109/TNN.2007.915115.
6
On Practical Issues for Stochastic STDP Hardware With 1-bit Synaptic Weights.
Front Neurosci. 2018 Oct 15;12:665. doi: 10.3389/fnins.2018.00665. eCollection 2018.
7
Quantization-Aware NN Layers with High-throughput FPGA Implementation for Edge AI.
Sensors (Basel). 2023 May 11;23(10):4667. doi: 10.3390/s23104667.
8
Spiking neural networks for handwritten digit recognition-Supervised learning and network optimization.
Neural Netw. 2018 Jul;103:118-127. doi: 10.1016/j.neunet.2018.03.019. Epub 2018 Apr 6.
9
Hybrid Precision Floating-Point (HPFP) Selection to Optimize Hardware-Constrained Accelerator for CNN Training.
Sensors (Basel). 2024 Mar 27;24(7):2145. doi: 10.3390/s24072145.
10
FPGA-Based Hybrid-Type Implementation of Quantized Neural Networks for Remote Sensing Applications.
Sensors (Basel). 2019 Feb 22;19(4):924. doi: 10.3390/s19040924.