A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks.

Affiliations

College of Systems Engineering, National University of Defense Technology, Changsha 410073, China.

College of Computer, National University of Defense Technology, Changsha 410073, China.

Publication Information

Comput Intell Neurosci. 2020 Feb 18;2020:7839064. doi: 10.1155/2020/7839064. eCollection 2020.

Abstract

The growing sophistication of neural network models in recent years has greatly increased their memory consumption and computational cost, hindering their deployment on ASICs, FPGAs, and other mobile devices. Compressing and accelerating neural networks is therefore necessary. In this study, we introduce a novel strategy for training low-bit networks whose weights and activations are quantized to a few bits, and we address two corresponding fundamental issues. The first is approximating activations through low-bit discretization to reduce computational cost and dot-product memory. The second is specifying a quantization and update mechanism for discrete weights that avoids gradient mismatch. With low-bit quantized weights and activations, costly full-precision operations can be replaced by shift operations. We evaluate the proposed method on common datasets, and the results show that it can dramatically compress neural networks with only a slight loss in accuracy.
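The abstract outlines the core idea: quantize weights and activations to a few bits, restrict values to powers of two so multiplications become bit shifts, and train with an update rule that sidesteps the non-differentiable rounding step. The paper's exact quantizer and update mechanism are not reproduced here; the following is a minimal PyTorch sketch of that general pattern, assuming a power-of-two weight quantizer with a straight-through estimator for the gradient (the class name, bit width, and usage are illustrative, not taken from the paper).

```python
import torch


class PowerOfTwoQuantize(torch.autograd.Function):
    """Quantize values to signed powers of two.

    Forward: round log2|w| to the nearest integer within a b-bit exponent
    range, keeping the sign, so multiplications by the quantized weights
    can be realized as bit shifts.
    Backward: pass the gradient straight through (STE) so the rounding
    step does not block gradient flow (one way to avoid gradient mismatch).
    """

    @staticmethod
    def forward(ctx, w, bits=4):
        # Exponents representable with `bits` bits: [min_exp, 0].
        min_exp = -(2 ** (bits - 1)) + 1
        sign = torch.sign(w)                             # zeros keep sign 0 -> stay 0
        mag = torch.clamp(w.abs(), min=2.0 ** min_exp)   # avoid log2(0)
        exp = torch.clamp(torch.round(torch.log2(mag)), min_exp, 0)
        return sign * (2.0 ** exp)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat quantization as identity;
        # `bits` is a non-tensor argument, so its gradient is None.
        return grad_output, None


# Usage: quantize weights on the fly while the optimizer keeps updating
# the underlying full-precision ("latent") weights.
w = torch.randn(8, requires_grad=True)
w_q = PowerOfTwoQuantize.apply(w, 4)
loss = (w_q ** 2).sum()
loss.backward()
print(w_q)
print(w.grad)
```

In a full training loop, a quantizer like this would typically wrap the weights of each convolutional or fully connected layer, with the optimizer stepping on the full-precision copies while the forward pass sees only quantized values.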


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efa2/7049432/58ca85e06127/CIN2020-7839064.001.jpg
