IBM Research-Almaden, 650 Harry Road, San Jose, CA, USA.
IBM Research-Yorktown Heights, 1101 Kitchawan Road, Yorktown Heights, NY, USA.
Nat Commun. 2022 Jun 30;13(1):3765. doi: 10.1038/s41467-022-31405-1.
Analogue memory-based deep neural networks provide energy-efficiency and per-area throughput gains relative to state-of-the-art digital counterparts such as graphics processing units. Recent advances focus largely on hardware-aware algorithmic training and improvements to circuits, architectures, and memory devices. Optimal translation of software-trained weights into analogue hardware weights, given the plethora of complex memory non-idealities, represents an equally important task. We report a generalised computational framework that automates the crafting of complex weight programming strategies to minimise accuracy degradations during inference, particularly over time. The framework is agnostic to network structure and generalises well across recurrent, convolutional, and transformer neural networks. As a highly flexible numerical heuristic, the approach accommodates arbitrary device-level complexity, making it potentially relevant for a variety of analogue memories. By quantifying the limit of achievable inference accuracy, it also enables analogue memory-based deep neural network accelerators to reach their full inference potential.
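To make concrete why weight programming is non-trivial, the following Python sketch (hypothetical, not the paper's actual framework; the differential conductance encoding, the power-law drift model, and all parameter values are assumptions chosen for illustration) maps a software-trained weight matrix onto conductance pairs, perturbs them with programming noise, applies conductance drift, and reports how the effective weights seen at inference time degrade:

    import numpy as np

    rng = np.random.default_rng(0)

    G_MAX = 25.0          # assumed maximum device conductance (microsiemens)
    PROG_NOISE_STD = 0.5  # assumed programming-noise std (microsiemens)
    DRIFT_NU = 0.05       # assumed drift exponent for G(t) = G(t0) * (t/t0)**-nu
    T0 = 1.0              # reference read time after programming (seconds)

    def program_weights(W, t_eval):
        """Map weights in [-1, 1] to a differential conductance pair,
        add programming noise, apply power-law drift, and read back the
        effective weights at time t_eval."""
        w_scaled = np.clip(W, -1.0, 1.0)
        # Differential encoding: positive weights on G_plus, negative on G_minus.
        g_plus = np.where(w_scaled > 0, w_scaled, 0.0) * G_MAX
        g_minus = np.where(w_scaled < 0, -w_scaled, 0.0) * G_MAX
        # Programming noise: each device lands near, not exactly at, its target.
        g_plus = np.clip(g_plus + rng.normal(0, PROG_NOISE_STD, W.shape), 0, G_MAX)
        g_minus = np.clip(g_minus + rng.normal(0, PROG_NOISE_STD, W.shape), 0, G_MAX)
        # Conductance drift: device conductances decay over time with a power law.
        decay = (t_eval / T0) ** (-DRIFT_NU)
        return (g_plus - g_minus) * decay / G_MAX

    W = rng.uniform(-1, 1, size=(64, 64))    # stand-in for trained weights
    for t in (1.0, 3600.0, 86400.0):         # 1 s, 1 h, 1 day after programming
        err = np.abs(program_weights(W, t) - W).mean()
        print(f"t = {t:>8.0f} s  mean |weight error| = {err:.4f}")

Under these assumed device models, the mean weight error grows with time since programming; this is precisely the time-dependent accuracy degradation that the weight programming strategies described in the abstract are designed to minimise.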