Stanford University, Stanford, CA, USA.
University of California San Diego, La Jolla, CA, USA.
Nature. 2022 Aug;608(7923):504-512. doi: 10.1038/s41586-022-04992-8. Epub 2022 Aug 17.
Realizing increasingly complex artificial intelligence (AI) functionalities directly on edge devices calls for unprecedented energy efficiency of edge hardware. Compute-in-memory (CIM) based on resistive random-access memory (RRAM) promises to meet this demand by storing AI model weights in dense, analogue and non-volatile RRAM devices, and by performing AI computation directly within RRAM, thus eliminating power-hungry data movement between separate compute and memory units. Although recent studies have demonstrated in-memory matrix-vector multiplication on fully integrated RRAM-CIM hardware, it remains a goal for an RRAM-CIM chip to simultaneously deliver high energy efficiency, the versatility to support diverse models, and software-comparable accuracy. Although efficiency, versatility and accuracy are all indispensable for broad adoption of the technology, the inter-related trade-offs among them cannot be addressed by isolated improvements at any single abstraction level of the design. Here, by co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, we present NeuRRAM, an RRAM-based CIM chip that simultaneously delivers versatility in reconfiguring CIM cores for diverse model architectures, energy efficiency twice that of previous state-of-the-art RRAM-CIM chips across various computational bit-precisions, and inference accuracy comparable to software models quantized to four-bit weights across various AI tasks, including 99.0 percent accuracy on MNIST and 85.7 percent on CIFAR-10 image classification, 84.7 percent accuracy on Google speech command recognition, and a 70 percent reduction in image-reconstruction error on a Bayesian image-recovery task.
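The core idea the abstract describes, performing matrix-vector multiplication inside the memory array, can be illustrated with a toy numerical model: weights become device conductances, input activations become applied voltages, and output currents sum along each column by Kirchhoff's current law. The sketch below is purely illustrative and assumes an idealized crossbar with a differential weight mapping and signed 4-bit weight quantization; none of the function names or parameters come from the NeuRRAM chip itself.

```python
import numpy as np

def quantize_weights(w, bits=4):
    """Quantize real-valued weights to signed integers of the given
    bit-width, returning the integer codes and the scale factor."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 levels for 4-bit signed
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale).astype(int), scale

def crossbar_mvm(w_q, v):
    """Idealized crossbar MVM: each weight is stored as a conductance,
    the input vector is applied as voltages, and per-column currents
    sum by Kirchhoff's current law. Signed weights are mapped onto a
    differential pair of non-negative conductances."""
    g_pos = np.maximum(w_q, 0)            # positive-weight devices
    g_neg = np.maximum(-w_q, 0)           # negative-weight devices
    return g_pos @ v - g_neg @ v          # differential current readout

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))           # toy weight matrix
x = rng.standard_normal(8)                # toy input activations

W_q, scale = quantize_weights(W, bits=4)
y_cim = scale * crossbar_mvm(W_q, x)      # in-memory-style result
y_ref = W @ x                             # full-precision reference
print(np.max(np.abs(y_cim - y_ref)))      # bounded quantization error
```

In a real chip the analogue summation would also be perturbed by device non-idealities and analogue-to-digital conversion, which is why the paper's hardware-measured accuracies are compared against software models quantized to the same four-bit weights.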